Compositions for increasing polypeptide stability and activity, and related methods

ABSTRACT

This disclosure provides peptides, polypeptides, fusion polypeptides, compositions, and methods for enhancing or increasing the stability of a polypeptide (e.g., Taq polymerase). Such peptides, polypeptides, fusion polypeptides, or compositions include polypeptides linked to a peptide tag that enhances the stability of the polypeptide. The peptides, polypeptides, fusion polypeptides, or compositions may also enhance the activity, specificity, and/or fidelity of other polypeptides in a reaction mixture. The disclosure also provides methods of using such peptides, polypeptides, fusion polypeptides, and compositions.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. application Ser. No. 15/068,477, filed Mar. 11, 2016, which is a continuation application of U.S. application Ser. No. 12/950,349, filed Nov. 19, 2010, which claims the benefit of U.S. Provisional Application No. 61/262,919, filed Nov. 19, 2009, U.S. Provisional Application No. 61/350,457, filed Jun. 1, 2010, U.S. Provisional Application No. 61/356,541, filed Jun. 18, 2010, and U.S. Provisional Application No. 61/390,857, filed Oct. 7, 2010, all of which are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

There is a need in the art for methods and compositions that enhance the stability of proteins. There is particularly a need for compositions that enhance the stability of polymerases such as Taq polymerases so that they can retain enzymatic activity after short-term or long-term exposure to temperatures above freezing. There is also a need in the art for compositions that enhance polymerase fidelity, sensitivity, and yield.

SUMMARY OF THE INVENTION

This disclosure provides peptides, polypeptides, fusion polypeptides, compositions, and methods for enabling the retention of activity of an enzyme (e.g., DNA polymerase, RNA polymerase, nuclease, reverse transcriptase, DNA deaminase, RNA deaminase, protease) or a protein (e.g., erythropoietin, human Leukemia Inhibitor Factor (hLIF), granulocyte macrophage colony-stimulating factor (GM-CSF), insulin, vascular endothelial growth factor (VEGF), leptin, bevacizumab) after short-term or long-term exposure to a temperature of from about −20° C. to about 35° C. In some embodiments, peptides, polypeptides, fusion polypeptides, or compositions provided herein enhance stability of an enzyme or protein at room temperature. In some embodiments, an enzyme or protein provided herein is any nucleic acid binding protein, e.g., a DNA binding protein, a RNA binding protein, a fragment thereof, or any combination thereof. In some embodiments, an enzyme or protein provided herein binds to other proteins, e.g., hormone receptors.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein retain activity at a temperature between −20° C. and 50° C. In some embodiments, polypeptides, fusion polypeptides, or compositions retain enzymatic activity or hormone activity at a temperature between −20° C. and 50° C. In some embodiments, the enzymatic activity or hormone activity of the polypeptides, fusion polypeptides, or compositions after exposure to a temperature between −20° C. and 50° C. is at least 50% of the enzymatic activity of the polypeptide prior to exposure to said temperature.
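By way of illustration only, the "at least 50% of the activity prior to exposure" criterion described above can be expressed as a simple ratio. The following Python sketch is not part of the disclosed methods; the function names, the activity units, and the default 50% threshold parameter are assumptions introduced here for clarity.

```python
def retained_activity_percent(activity_before: float, activity_after: float) -> float:
    """Return the activity after exposure as a percentage of the activity before.

    Both values are assumed to be measured with the same assay and units
    (hypothetical inputs; the disclosure does not prescribe an assay here).
    """
    if activity_before <= 0:
        raise ValueError("activity_before must be positive")
    return 100.0 * activity_after / activity_before


def meets_retention_threshold(activity_before: float,
                              activity_after: float,
                              threshold_percent: float = 50.0) -> bool:
    """True if the retained activity is at least threshold_percent of the original."""
    return retained_activity_percent(activity_before, activity_after) >= threshold_percent
```

For example, under these assumptions a sample measured at 200 units before storage and 120 units after storage retains 60% of its activity and satisfies a 50% threshold.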

In some embodiments, peptide tags provided herein increase stability of the polypeptides, fusion polypeptides, or compositions. In some embodiments, peptide tags stabilize the polypeptides, fusion polypeptides, or compositions. In some embodiments, peptide tags inhibit loss of enzymatic activity of the polypeptides, fusion polypeptides, or compositions. In some embodiments, peptide tags inhibit degradation of the polypeptides, fusion polypeptides, or compositions. In some embodiments, peptide tags increase stability or inhibit loss of enzymatic activity of the polypeptides, fusion polypeptides, or compositions for at least one day. In some embodiments, peptide tags increase stability or inhibit loss of enzymatic activity of the polypeptides, fusion polypeptides, or compositions for at least one week. In some embodiments, peptide tags increase stability or inhibit loss of enzymatic activity of the polypeptides, fusion polypeptides, or compositions for at least one month.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein demonstrate enhanced stability, enzymatic activity, or hormone activity when compared to a similar polypeptide that does not comprise a peptide tag provided herein. In some embodiments, polypeptides, fusion polypeptides, or compositions have at least 50%, 60%, 70%, 80%, 90%, or 95% of the enzymatic activity or hormone activity of a similar polypeptide that does not comprise a peptide tag provided herein. In some embodiments, polypeptides, fusion polypeptides, or compositions have at least 50%, 60%, 70%, 80%, 90%, or 95% of the enzymatic activity or hormone activity of a similar polypeptide that does not comprise a peptide tag provided herein, after exposure to a temperature between −20° C. and 50° C. In some embodiments, polypeptides, fusion polypeptides, or compositions have at least 5%, 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or 200% greater enzymatic activity or hormone activity than the enzymatic activity or hormone activity of a similar polypeptide that does not comprise a peptide tag provided herein, after exposure to a temperature between −20° C. and 50° C.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a peptide tag that has an amino acid sequence that is 50% to 98% identical to SEQ ID NO: 1. In some embodiments, the peptide tag has an amino acid sequence as shown in SEQ ID NO: 1, SEQ ID NO: 13, or SEQ ID NO: 14. In some embodiments, the peptide tag has an amino acid sequence that is at least 70% identical to SEQ ID NO: 13 or SEQ ID NO: 14. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise an amino acid sequence that is at least 70% identical to SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 11. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise an amino acid sequence as shown in SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 11. In some embodiments, the fusion polypeptide comprises an amino acid sequence that is at least 70% identical to SEQ ID NO: 1, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22. In some embodiments, polypeptides, fusion polypeptides, or compositions do not have the amino acid sequence as shown in SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11.
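For illustration, the percent identity thresholds recited above (e.g., "at least 70% identical" or "50% to 98% identical") may be evaluated on a pair of pre-aligned sequences. The Python sketch below rests on several assumptions that are not stated in this section: the two sequences have already been aligned to equal length with "-" marking gaps, identity is counted as matching non-gap positions divided by the alignment length, and the example sequences are hypothetical toy strings rather than any SEQ ID NO of this disclosure. Other conventions for the denominator exist and would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap, and the denominator is the full
    alignment length (one of several conventions in common use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_in_range(aligned_a: str, aligned_b: str,
                      low: float = 50.0, high: float = 98.0) -> bool:
    """True if the percent identity lies within [low, high], inclusive."""
    return low <= percent_identity(aligned_a, aligned_b) <= high
```

With the hypothetical ten-residue strings "ACDEFGHIKL" and "ACDEFGHIKV", nine of ten positions match, so the identity under this convention is 90%, which falls within the 50% to 98% range; two identical strings give 100% and fall outside it.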

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a peptide tag that is encoded by a nucleotide sequence that is at least 70% identical to SEQ ID NO: 3 or SEQ ID NO: 15. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a peptide tag that is encoded by a nucleotide sequence as shown in SEQ ID NO: 3 or SEQ ID NO: 15. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a polypeptide encoded by a nucleotide sequence that is at least 70% identical to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 12. In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a polypeptide encoded by a nucleotide sequence as shown in SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 7, or SEQ ID NO: 12.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a polypeptide that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 10. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise a polypeptide that has an amino acid sequence as shown in SEQ ID NO: 10. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise a polypeptide comprising a sequence motif that binds to a double stranded DNA. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise a polypeptide that is encoded by a nucleotide sequence that is at least 70% identical to SEQ ID NO: 9. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise a polypeptide that is encoded by a nucleotide sequence as shown in SEQ ID NO: 9. In some embodiments, the fusion polypeptide comprises a polypeptide encoded by SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 12, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23.

In some embodiments, fusion polypeptides provided herein comprise a peptide tag that has an amino acid sequence that is 50% to 98% identical to SEQ ID NO: 1 and at least one polypeptide, wherein the peptide tag is linked to said at least one polypeptide, and the peptide tag stabilizes the fusion polypeptide at a temperature between −20° C. and 50° C. In one embodiment, the peptide tag is covalently linked to the at least one polypeptide. In one embodiment, the peptide tag is non-covalently linked to the at least one polypeptide. In one embodiment, the peptide tag is linked to the amino-terminus of the at least one polypeptide. In one embodiment, the peptide tag is linked to the carboxy-terminus of the at least one polypeptide. In some embodiments, the fusion polypeptide comprises a first polypeptide and a second polypeptide. In one embodiment, the first polypeptide is an enzyme and the second polypeptide is a double strand binding protein. In one embodiment, the peptide tag is linked to the amino-terminus of the first polypeptide, and the carboxy-terminus of the first polypeptide is linked to the amino-terminus of the second polypeptide. In one embodiment, the peptide tag is linked to the amino-terminus of the second polypeptide, and the carboxy-terminus of the second polypeptide is linked to the amino-terminus of the first polypeptide.

In one aspect, this disclosure provides a polypeptide, fusion polypeptide, or composition comprising a peptide tag linked to a polypeptide, wherein said polypeptide retains an enzymatic activity after exposure to a temperature of at least about −10° C. to about 50° C., and wherein said fusion polypeptide does not have the amino acid sequence of SEQ ID NO: 2.

In another aspect, this disclosure provides a composition comprising: (a) a fusion polypeptide comprising a first polypeptide linked to a peptide, wherein said fusion polypeptide retains an enzymatic activity after exposure to a temperature of at least about −10° C. to about 50° C.; and (b) a second polypeptide.

In some embodiments, the peptide is covalently linked to said polypeptide, said first polypeptide or said second polypeptide. In some embodiments, the peptide is non-covalently linked to said polypeptide, said first polypeptide or said second polypeptide. In some embodiments, the peptide is linked to said polypeptide, said first polypeptide or said second polypeptide at the amino-terminus of said polypeptide, said first polypeptide or said second polypeptide. In some embodiments, said peptide is linked to said polypeptide, said first polypeptide or said second polypeptide at the carboxy-terminus of said polypeptide, said first polypeptide or said second polypeptide. In some embodiments, said polypeptide, first polypeptide, or second polypeptide is a thermostable protein. In some embodiments, said thermostable protein is an enzyme. In some embodiments, said enzyme is a polymerase, a reverse transcriptase, a nuclease, a pyrophosphatase, a protease, or a deaminase. In some embodiments, said fusion polypeptide is a polypeptide encoded by SEQ ID NO: 4. In some embodiments, said fusion polypeptide is at least 70% identical to a polypeptide encoded by SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 12. In some embodiments, said peptide is at least 70% identical to SEQ ID NO: 1.

In some embodiments, provided herein are compositions comprising a fusion polypeptide comprising a peptide tag linked to a first polypeptide, and a second polypeptide, wherein the peptide tag stabilizes said first polypeptide or said second polypeptide at a temperature between −20° C. and 50° C. In one embodiment, the peptide tag stabilizes the first polypeptide or the second polypeptide for at least 1 day at a temperature between −20° C. and 50° C. In one embodiment, the fusion polypeptide or the second polypeptide retains enzymatic activity or hormone activity at a temperature between −20° C. and 50° C. In one embodiment, the enzymatic activity or hormone activity of the fusion polypeptide or the second polypeptide after exposure to a temperature between −20° C. and 50° C. is at least 50% of the enzymatic activity or hormone activity of the fusion polypeptide or the second polypeptide prior to exposure to said temperature. In one embodiment, the first polypeptide or the second polypeptide is a polymerase, reverse transcriptase, nuclease, pyrophosphatase, deaminase, or protease. In one embodiment, the first polypeptide or the second polypeptide is erythropoietin, human Leukemia Inhibitor Factor (hLIF), granulocyte macrophage colony-stimulating factor (GM-CSF), insulin, vascular endothelial growth factor (VEGF), leptin, or bevacizumab. In one embodiment, the first polypeptide or the second polypeptide comprises at least one mutation. In one embodiment, the fusion polypeptide comprises a polypeptide that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22. In one embodiment, the fusion polypeptide comprises a polypeptide encoded by SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 12, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23.

In some embodiments, compositions provided herein further comprise a third polypeptide. In one embodiment, the fusion polypeptide has a sequence as shown in SEQ ID NO: 2, the second polypeptide has a sequence as shown in SEQ ID NO: 6, and the third polypeptide has a sequence as shown in SEQ ID NO: 11.

In some embodiments, said second polypeptide is a polymerase. In some embodiments, said second polypeptide is at least 70% identical to a polypeptide encoded by SEQ ID NO: 5 or SEQ ID NO: 12. In some embodiments, said second polypeptide is at least 70% identical to a polypeptide encoded by SEQ ID NO: 4. In some embodiments, said fusion polypeptide comprises an enzyme or polymerase and said peptide has at least 70% identity to a peptide encoded by SEQ ID NO: 3, SEQ ID NO: 7, or SEQ ID NO: 9. In some embodiments, said polypeptide, first polypeptide, or second polypeptide is selected from the group consisting of: DNA polymerase I, Thermus aquaticus DNA polymerase I (Taq), and Thermococcus gorgonarius DNA polymerase (Tgo). In some embodiments, said polypeptide, first polypeptide, or second polypeptide is erythropoietin. In some embodiments, said polypeptide, first polypeptide or second polypeptide is a Taq polymerase. In some embodiments, said polypeptide, first polypeptide or second polypeptide is a Tgo polymerase, or 70% identical to Tgo polymerase. In some embodiments, said polypeptide, first polypeptide, or second polypeptide is selected from the group consisting of: Thermoplasma acidophilum pyrophosphatase (TAPP), Pyrococcus horikoshii dCTP deaminase, cytidine deaminase and a deoxycytidine deaminase. In some embodiments, the deaminase is a RNA deaminase or a DNA deaminase. In some embodiments, said polypeptide, first polypeptide, or second polypeptide is a non-thermostable protein. In some embodiments, said non-thermostable protein is human Leukemia Inhibitor Factor (hLIF) or leptin. In some embodiments, said temperature is about 20° C. to about 30° C.

In some embodiments, the exposure to the temperature is for at least 1 week. In some embodiments, said enzymatic activity is greater than about 50% of the activity of the enzyme prior to exposure to a temperature of at least about −20° C. to about 35° C. In some embodiments, said peptide has an amino acid sequence with at least 70%, 75%, 80%, 85%, 90%, or 95% identity to the sequence of SEQ ID NO: 1, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 13. In some embodiments, said fusion polypeptide has an amino acid sequence that is at least 70% identical to SEQ ID NO: 2. In some embodiments, the peptide is at least 70% identical to a peptide encoded by a nucleotide sequence that is SEQ ID NO: 3, SEQ ID NO: 7, or SEQ ID NO: 9. In some embodiments, the peptide-linked polypeptide is at least 70% identical to a polypeptide encoded by a nucleotide sequence that is SEQ ID NO: 5 or SEQ ID NO: 12. In some embodiments, the peptide-linked polypeptide is at least 70% identical to a polypeptide encoded by a nucleotide sequence that is SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 12.

In yet another aspect, the disclosure provides a polypeptide, fusion polypeptide, or composition comprising a peptide with an amino acid sequence that is at least 70% homologous to SEQ ID NO: 1, 8, or 13. In some embodiments, the peptide is linked to a polypeptide. In some embodiments, the peptide is linked to the polypeptide through a covalent or non-covalent linkage. In some embodiments, the polypeptide is a thermostable protein. In some embodiments, the thermostable protein is an enzyme. In some embodiments, the enzyme is a polymerase, a reverse transcriptase, a nuclease, a protease, a pyrophosphatase, or a deaminase. In some embodiments, the polymerase is DNA polymerase I, Thermus aquaticus DNA polymerase I (Taq), or Thermococcus gorgonarius DNA polymerase (Tgo). In some embodiments, the polymerase is a Taq polymerase. In some embodiments, the pyrophosphatase is Thermoplasma acidophilum pyrophosphatase (TAPP). In some embodiments, the deaminase is Pyrococcus horikoshii dCTP deaminase. In some embodiments, the deaminase is a cytidine deaminase or a deoxycytidine deaminase. In some embodiments, the deaminase is a RNA deaminase or a DNA deaminase. In some embodiments, the polypeptide is a non-thermostable protein. In some embodiments, said polypeptide, first polypeptide or second polypeptide is Thermus thermophilus (Tth) DNA polymerase or ZO5 polymerase.

In some embodiments, the polypeptide, first polypeptide or second polypeptide is human Leukemia Inhibitor Factor (hLIF) or leptin. In some embodiments, the peptide-linked polypeptide retains an enzymatic activity after exposure to a temperature of about −20° C. to about 35° C. In some embodiments, the polypeptide exhibits an enzymatic activity after exposure to a temperature of about 20° C. to about 30° C. In some embodiments, the exposure to a temperature is for greater than 1 day. In some embodiments, the enzymatic activity is greater than about 50% of the activity of the composition prior to the exposure to a temperature of at least about −20° C. to about 35° C. In some embodiments, the peptide is encoded by a nucleotide sequence that is at least 70% identical to SEQ ID NO: 3 or SEQ ID NO: 7.

In yet a further aspect, this disclosure provides a fusion polypeptide comprising a first peptide that is at least 70% identical to a peptide encoded by SEQ ID NO: 3 and a second peptide that is at least 70% identical to a peptide encoded by SEQ ID NO: 9. In some embodiments, said first and second peptides are linked to a third peptide. In some embodiments, said first and second peptides are linked to each other. In some embodiments, said linkage is covalent. In some embodiments, said first peptide is linked to the N-terminus of a polypeptide and said second peptide is linked to the C-terminus of said polypeptide.

In some embodiments, said second peptide is linked to the C-terminus of said first peptide. In some embodiments, said fusion polypeptide has at least 70% identity to a peptide encoded by SEQ ID NO: 7.

In yet another aspect, this disclosure provides a method of nucleic acid amplification comprising extending a nucleic acid primer with a mixture comprising a polymerase, wherein the polymerase is linked to a peptide that is at least 70% identical to a peptide encoded by SEQ ID NO: 3, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 15, SEQ ID NO: 17 or SEQ ID NO: 19. In some embodiments, the polymerase is linked at its N-terminus to the peptide. In some embodiments, the polymerase is linked at its C-terminus to the peptide. In some embodiments, the polymerase is a Taq polymerase. In some embodiments, the polymerase exhibits an enzymatic activity after exposure to a temperature between −20° C. and 50° C. In some embodiments, the polymerase exhibits an enzymatic activity after exposure to a temperature of about −20° C. to about 35° C. In some embodiments, the mixture further comprises a second polymerase. In some embodiments, said second polymerase is linked to a peptide sequence that is at least 70% homologous to SEQ ID NO: 1, SEQ ID NO: 8, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18. In some embodiments, the polymerase exhibits an enzymatic activity after exposure to a temperature of about 20° C. to about 30° C. for at least one day. In some embodiments, the enzymatic activity is greater than about 50% of the activity of the composition prior to exposure to the temperature of about 20° C. to about 30° C. for at least one day.

In some embodiments, provided herein are methods of increasing stability of a polypeptide comprising providing a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1.

In some embodiments, provided herein is the use of a peptide tag to increase stability of a polypeptide, wherein the peptide tag has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1.

In some embodiments, the peptide tag has an amino acid sequence as shown in SEQ ID NO: 1, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18. In some embodiments, the peptide tag is encoded by a nucleic acid sequence as shown in SEQ ID NO: 3, SEQ ID NO: 15, SEQ ID NO: 17, or SEQ ID NO: 19. In some embodiments, the peptide tag comprises at least one to six histidine residues. In some embodiments, the peptide tag comprises a protease cleavage site. In some embodiments, the protease cleavage site comprises the amino acid sequence DDDDK. In some embodiments, the peptide tag inhibits degradation or denaturation of the polypeptide at a temperature between −20° C. and 50° C. In some embodiments, the peptide tag inhibits loss of protein function of the polypeptide at a temperature between −20° C. and 50° C. In some embodiments, the protein function of the polypeptide after exposure to said temperature is at least 50% of the protein function of the polypeptide prior to exposure to said temperature. In some embodiments, the peptide tag maintains stability of the polypeptide for at least one day at a temperature between −20° C. and 50° C. In some embodiments, the peptide tag linked to the polypeptide has an amino acid sequence as shown in SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 11, SEQ ID NO: 20, or SEQ ID NO: 22.

In some embodiments, the peptide tag is linked to the polypeptide. In some embodiments, the peptide tag is covalently linked to the polypeptide. In some embodiments, the peptide tag is non-covalently linked to the polypeptide. In some embodiments, the peptide tag is linked to the amino-terminus of the polypeptide. In some embodiments, the peptide tag is linked to the carboxy-terminus of the polypeptide. In some embodiments, the polypeptide is erythropoietin, human Leukemia Inhibitor Factor (hLIF), granulocyte macrophage colony-stimulating factor (GM-CSF), insulin, vascular endothelial growth factor (VEGF), leptin, or bevacizumab. In some embodiments, the polypeptide comprises at least one mutation.

In some embodiments, the peptide tag is not linked to the polypeptide.

In some embodiments, the peptide tag linked to the polypeptide is encoded by a nucleic acid sequence as shown in SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 12, SEQ ID NO: 21, or SEQ ID NO: 23. In some embodiments, the polypeptide is a thermostable protein or enzyme. In some embodiments, the enzyme is a polymerase, reverse transcriptase, nuclease, pyrophosphatase, deaminase, or protease. In some embodiments, the polymerase is a DNA polymerase I, Thermus aquaticus DNA polymerase I (Taq), Thermococcus gorgonarius DNA polymerase (Tgo), Thermus thermophilus (Tth) DNA polymerase, or ZO5 DNA polymerase. In some embodiments, the pyrophosphatase is a Thermoplasma acidophilum pyrophosphatase (TAPP). In some embodiments, the deaminase is a Pyrococcus horikoshii dCTP deaminase.

In some embodiments, provided herein are methods of increasing stability of a polypeptide, fusion polypeptide, or composition comprising providing a peptide tag that is 50% to 98% identical to SEQ ID NO: 1. In some embodiments, provided herein are methods of increasing stability of a polypeptide, fusion polypeptide, or composition comprising providing a polypeptide that is not SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11. In some embodiments, provided herein are methods of inhibiting loss of enzymatic activity or hormone activity of a polypeptide, fusion polypeptide, or composition comprising providing a peptide tag that is 50% to 98% identical to SEQ ID NO: 1. In some embodiments, provided herein are methods of inhibiting loss of enzymatic activity or hormone activity of a polypeptide, fusion polypeptide, or composition comprising providing a polypeptide that is not SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11. In some embodiments, provided herein are methods of inhibiting degradation of a polypeptide, fusion polypeptide, or composition comprising providing a peptide tag that is 50% to 98% identical to SEQ ID NO: 1. In some embodiments, provided herein are methods of inhibiting degradation of a polypeptide, fusion polypeptide, or composition comprising providing a polypeptide that is not SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11. In some embodiments, the peptide is linked to the polypeptide. In some embodiments, the polypeptide or composition further comprises a second polypeptide, wherein the peptide tag linked to the polypeptide increases stability of the second polypeptide. In some embodiments, the polypeptide or composition further comprises a third polypeptide, wherein the peptide tag linked to the polypeptide increases stability of the second polypeptide or the third polypeptide.

In some embodiments, provided herein are methods of increasing stability of a polypeptide comprising providing a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1, wherein the peptide tag is not SEQ ID NO: 1. In some embodiments, provided herein is the use of a peptide tag to increase stability of a polypeptide, wherein the peptide tag has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1, wherein the peptide tag is not SEQ ID NO: 1. In some embodiments, provided herein are methods of increasing stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a polypeptide. In some embodiments, provided herein is the use of a peptide tag to increase stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a polypeptide. In some embodiments, provided herein are methods of increasing stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a Taq polymerase. In some embodiments, provided herein is the use of a peptide tag to increase stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a Taq polymerase. In some embodiments, provided herein are methods of increasing stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a Tgo polymerase. In some embodiments, provided herein is the use of a peptide tag to increase stability of a polypeptide, fusion polypeptide, or composition, wherein the polypeptide, fusion polypeptide, or composition is not SEQ ID NO: 1 linked to a Tgo polymerase.

In yet another aspect, this disclosure provides a nucleic acid vector for use in a bacterium comprising a eukaryotic translation initiation sequence upstream of a nucleic acid sequence encoding a polypeptide linked to a peptide, wherein said polypeptide retains enzymatic activity at a temperature between about −20° C. to about 35° C., or 20° C. to about 50° C. In some embodiments, said polypeptide is translated as both a short and long form. In some embodiments, the eukaryotic translation initiation sequence at least partially encodes a polypeptide that retains an enzymatic activity at a temperature between about −20° C. to about 35° C. In some embodiments, the eukaryotic translation initiation sequence is a Kozak sequence (GCCGCCACCATGGTC). In some embodiments, the eukaryotic translation initiation sequence is upstream of a nucleic acid sequence encoding a polypeptide that is SEQ ID NOs: 1, 2, 6, 8, 10, 11 or 13 or variants, fragments, or mutants thereof. In some embodiments, the composition comprises a bacterium comprising a nucleic acid vector described herein.
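As a non-limiting illustration, the position of the Kozak sequence recited above (GCCGCCACCATGGTC) within a vector sequence, and of the ATG start codon it contains, can be located by a plain string search. The Python sketch below is an assumption-laden example: the function name is hypothetical, the search is case-insensitive and exact (no mismatches tolerated), and only the first occurrence is reported.

```python
from typing import Optional

# Kozak sequence as recited in this disclosure.
KOZAK = "GCCGCCACCATGGTC"


def find_kozak_start_codon(sequence: str) -> Optional[int]:
    """Return the 0-based index of the ATG inside the first exact Kozak match.

    Returns None when the Kozak sequence is absent. The search ignores case
    and does not consider the reverse complement (simplifying assumptions).
    """
    position = sequence.upper().find(KOZAK)
    if position == -1:
        return None
    # Offset of the ATG start codon within the Kozak sequence itself.
    return position + KOZAK.index("ATG")
```

For instance, for a hypothetical insert consisting of three arbitrary bases followed by the Kozak sequence, the function reports the start codon at index 12 (three leading bases plus the nine-base offset of ATG within the Kozak sequence).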

In yet another aspect, this disclosure provides a composition comprising a polypeptide linked to a peptide, wherein said polypeptide retains an enzymatic activity at a temperature between about −20° C. to about 35° C., wherein said polypeptide is encoded by a nucleic acid sequence having a eukaryotic translation initiation sequence. In some embodiments, the polypeptide is a thermostable protein. In some embodiments, the thermostable protein is an enzyme. In some embodiments, the enzyme is a polymerase, a pyrophosphatase, or a deaminase. In some embodiments, the polymerase is a DNA polymerase I, Thermus aquaticus DNA polymerase I (Taq), or Thermococcus gorgonarius DNA polymerase (Tgo). In some embodiments, the polymerase is a Taq polymerase. In some embodiments, the polymerase is not Taq polymerase. In some embodiments, the pyrophosphatase is Thermoplasma acidophilum pyrophosphatase (TAPP). In some embodiments, the deaminase is Pyrococcus horikoshii dCTP deaminase. In some embodiments, the deaminase is a cytidine deaminase or a deoxycytidine deaminase. In some embodiments, the deaminase is a RNA deaminase or a DNA deaminase. In some embodiments, said eukaryotic translation initiation sequence is a Kozak sequence (GCCGCCACCATGGTC). In some embodiments, said composition comprises both a short form and a long form of said polypeptide. In some embodiments, said polypeptide linked to a peptide is at least 70% identical to a polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 12. In some embodiments, said polypeptide is linked to a peptide at least 70% identical to SEQ ID NO: 1, 3 or 10.

In some cases, when the enzyme (e.g., Taq polymerase, DNA deaminase, RNA deaminase) is linked to the peptide (e.g., a peptide at least 70% identical to SEQ ID NO: 1) it exhibits at least 20%, 50%, 75%, 80%, 85%, 90%, 95%, or 100% of its activity prior to short-term or long-term exposure to temperatures of from about −20° C. to about 35° C. In some cases, the exposure occurs for at least 1, 2, 3, 4, 5, 6, or 10 hours, at least 1, 2, 3, 4, 5, or 6 days, or at least 1, 2, 3, 4, 5, 6, or 10 weeks, or at least 1, 2, 3, 4, 5, 6, or 10 months.

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the embodiments provided herein are set forth with particularity in the appended claims. A better understanding of the features and advantages of the embodiments provided herein will be obtained by reference to the following detailed description and drawings that set forth illustrative embodiments, in which the principles of the embodiments are utilized.

FIG. 1 depicts a qPCR amplification that is performed in order to test the stability of a peptide-polypeptide fusion protein (SEQ ID NO: 2) after exposure to 35° C. The first sample of fresh qPCR mix is stored at −20° C. (top panel) and the second sample is stored at 35° C. for 4 weeks (lower panel).

FIG. 2 depicts the amino acid sequence (SEQ ID NO: 1) of a 42 amino acid peptide tag.

FIG. 3 depicts the amino acid sequence of a fusion polypeptide (SEQ ID NO: 2) consisting of the peptide tag of FIG. 2 (SEQ ID NO: 1) linked to the N-terminus of wild-type Taq polymerase. The sequence of the 42 amino acid peptide tag is underlined.

FIG. 4 depicts a nucleotide sequence (SEQ ID NO: 3) encoding the 42 amino acid peptide tag (SEQ ID NO: 1).

FIG. 5 depicts the nucleotide sequence encoding a fusion polypeptide (SEQ ID NO: 2) consisting of the 42 amino acid (a.a.) peptide tag of FIG. 2 (SEQ ID NO: 1) linked to the N-terminus of wild-type Taq polymerase. The entire nucleotide sequence is designated SEQ ID NO: 4. The nucleotide sequence that encodes the 42 amino acid peptide tag is underlined.

FIG. 6 depicts the nucleotide sequence encoding a fusion polypeptide (SEQ ID NO: 6) consisting of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a peptide corresponding to a fragment of Double-Stranded Binding protein (DSP) (underlined portion, not bolded), linked to the N-terminus of wild-type Taq polymerase. The entire nucleotide sequence is designated SEQ ID NO: 5.

FIG. 7 depicts the amino acid sequence of a fusion polypeptide (SEQ ID NO: 6) consisting of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a peptide corresponding to a fragment of Double-Stranded Binding protein (DSP) (underlined portion, not bolded), linked to the N-terminus of wild-type Taq polymerase.

FIG. 8 depicts the nucleotide sequence (SEQ ID NO: 7) encoding a peptide tag consisting of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a peptide corresponding to a fragment of Double-Stranded Binding protein (DSP) (underlined portion, not bolded).

FIG. 9 depicts the amino acid sequence (SEQ ID NO: 8) of a tag peptide consisting of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a peptide corresponding to a fragment of Double-Stranded Binding protein (DSP) (underlined portion, not bolded).

FIG. 10 depicts the nucleotide sequence (SEQ ID NO: 9) encoding a DSP tag peptide.

FIG. 11 depicts the amino acid sequence (SEQ ID NO: 10) of the DSP tag.

FIG. 12 depicts the amino acid sequence (SEQ ID NO: 11) of a fusion polypeptide of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a Tgo polymerase polypeptide linked to a DSP peptide (underlined portion, not bolded).

FIG. 13 depicts the nucleotide sequence (SEQ ID NO: 12) encoding a fusion polypeptide of the modified peptide tag fragment of FIG. 2 (SEQ ID NO: 1) (bold and underlined) linked to a Tgo polymerase polypeptide, which is linked to a DSP peptide (underlined portion, not bolded).

FIG. 14 depicts the amino acid sequence (SEQ ID NO: 13) of a fragment of the peptide of SEQ ID NO: 1.

FIG. 15 depicts an electrophoresis gel showing DNA amplification from barley genomic DNA using a variety of polymerases.

FIG. 16 depicts an amino acid sequence (SEQ ID NO: 14) of a modified fragment of the peptide of SEQ ID NO: 1.

FIG. 17 depicts a nucleotide sequence (SEQ ID NO: 15) encoding the 36 amino acid peptide (SEQ ID NO: 14).

FIG. 18 depicts an amino acid sequence (SEQ ID NO: 16) of a modified fragment of the peptide of SEQ ID NO: 1.

FIG. 19 depicts a nucleotide sequence (SEQ ID NO: 17) encoding the 40 amino acid peptide (SEQ ID NO: 16).

FIG. 20 depicts an amino acid sequence (SEQ ID NO: 18) of a modified fragment of the peptide of SEQ ID NO: 1.

FIG. 21 depicts a nucleotide sequence (SEQ ID NO: 19) encoding the 29 amino acid peptide (SEQ ID NO: 18).

FIG. 22 depicts an amino acid sequence (SEQ ID NO: 20) of a modified fragment of the peptide of SEQ ID NO: 1 linked to a human erythropoietin polypeptide.

FIG. 23 depicts a nucleotide sequence (SEQ ID NO: 21) encoding the polypeptide of SEQ ID NO: 20.

FIG. 24 depicts an amino acid sequence (SEQ ID NO: 22) of a modified fragment of the peptide of SEQ ID NO: 1 linked to a human leukemia inhibitory factor.

FIG. 25 depicts a nucleotide sequence (SEQ ID NO: 23) encoding the polypeptide of SEQ ID NO: 22.

FIG. 26 depicts an electrophoresis gel showing DNA amplification from mouse genomic DNA using peptide tag-polymerase mixtures.

DETAILED DESCRIPTION OF THE INVENTION

Overview

The present disclosure provides compositions and methods that enhance the stability of proteins (e.g., thermostable enzymes, non-thermostable enzymes) following short-term or long-term exposure to a temperature between −20° C. and +50° C. or from about −20° C. to +35° C. In some embodiments, the compositions are peptide tags or fusion proteins comprising peptide tags. The proteins can be any type of protein. The peptide tags may aid the retention of protein structure, stability, enzymatic activity, binding activity, and any other property. In some embodiments, the proteins are nucleic acid binding proteins. In some embodiments, the fusion proteins demonstrate enhanced stability or enzymatic activity when compared to a similar protein that does not have the tag, especially after short-term or long-term exposure to a certain temperature (e.g., room temperature). Also disclosed herein are fusion polypeptides that enhance the activity (e.g., sensitivity, yield, specificity) of other proteins, when the fusion polypeptides are mixed together with such proteins in a reaction sample. Also provided are vectors for the compositions described herein, kits, as well as methods of using the compositions.

Peptide Tags

The compositions disclosed herein include peptides (e.g., a peptide with the amino acid sequence of SEQ ID NO: 1 (FIG. 2), SEQ ID NO: 8 (FIG. 9), SEQ ID NO: 10 (FIG. 11), SEQ ID NO: 13 (FIG. 14), SEQ ID NO: 14 (FIG. 16)) that enhance the stability of a polypeptide (e.g., enzyme, Taq polymerase), and variants, mutants, and fragments thereof.

As used herein, enhancing or increasing stability of a polypeptide refers to, for example, maintaining stability of the polypeptide, inhibiting degradation of the polypeptide, inhibiting denaturation of the polypeptide, inhibiting loss of protein activity (e.g., enzymatic or hormone activity) of the polypeptide, inhibiting aggregation of the polypeptide, inhibiting crystallization of the polypeptide, inhibiting absorption of the polypeptide, preserving the function of the polypeptide, or preserving the primary, secondary, or tertiary structure of the polypeptide.

SEQ ID NO: 1 (FIG. 2) shows the amino acid sequence of a long form (42 amino acids) of a peptide tag described herein. SEQ ID NO: 13 (FIG. 14) discloses a 31 amino acid fragment of SEQ ID NO: 1, which can also be used as a peptide tag for the polypeptides, fusion polypeptides, compositions, and methods disclosed herein. SEQ ID NO: 14 (FIG. 16) discloses a 36 amino acid fragment of SEQ ID NO: 1, which can also be used as a peptide tag for the polypeptides, fusion polypeptides, compositions, and methods disclosed herein. SEQ ID NO: 1, SEQ ID NO: 13, and SEQ ID NO: 14 can be used singly, together, or in combination with other tags, in order to enhance the stability, binding affinity, enzymatic activity, yield, or other property of a polypeptide.

SEQ ID NO: 10 discloses the sequence of a fragment of a double-stranded DNA binding protein (DSP). The peptide of SEQ ID NO: 10 can also be used in the compositions and methods described herein, either on its own, or with the peptide tag of SEQ ID NO: 1, or other peptide tag described herein. For example, FIG. 7 (SEQ ID NO: 6) provides an example of a polymerase linked to a fragment of SEQ ID NO: 1 and to a fragment of SEQ ID NO: 10. FIG. 12 (SEQ ID NO: 11) provides an example of a polymerase (here, Tgo polymerase) that is linked both to a fragment of SEQ ID NO: 1 and to a fragment of DSP, SEQ ID NO: 10, which is disclosed as the unbolded, underlined sequence in SEQ ID NO: 11 (FIG. 12).

The compositions also include peptides that are at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical (or homologous) to SEQ ID NO: 1 (FIG. 2), SEQ ID NO: 8 (FIG. 9), SEQ ID NO: 10 (FIG. 11), SEQ ID NO: 13 (FIG. 14), or SEQ ID NO: 14 (FIG. 16). Similarly, the compositions further include peptides that are at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to peptides encoded by SEQ ID NO: 3, SEQ ID NO: 7, SEQ ID NO: 9, or SEQ ID NO: 15.

In some embodiments, the peptide tag is limited to 50 amino acids. In some embodiments, the peptide tag is limited to 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acids.

The percent sequence identity of two amino acid sequences is determined by aligning the sequences using a global alignment that takes into account the entire length of the peptide or polypeptide, as described by the Needleman-Wunsch-Sellers algorithm (Needleman et al., (1970), J. Mol. Biol. 48:444; Sellers (1974), SIAM J. Appl. Math., 26:787). Illustrative parameters for FASTA analysis are: ktup=1, gap opening penalty=10, gap extension penalty=1, and substitution matrix=BLOSUM62.
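
As a non-limiting illustration of percent identity computed over a global alignment, the following Python sketch implements a simplified Needleman-Wunsch alignment. It rests on several assumptions that differ from the illustrative FASTA parameters above: it uses a flat match/mismatch score and a linear gap penalty rather than the BLOSUM62 matrix with separate gap opening and extension penalties, and it reports identity as identical positions divided by the number of alignment columns, which is only one of several conventions in use. The function name global_identity and the default scores are hypothetical.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity of two sequences over a global alignment
    (simplified scoring; see the assumptions stated in the text)."""
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        return 0.0

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting columns and identities.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1          # gap in b
        else:
            j -= 1          # gap in a
        columns += 1
    return 100.0 * identical / columns


print(global_identity("DDDDK", "DDDEK"))
```

For production use, an established alignment package configured with the parameters recited above would be preferable to this sketch.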

In some embodiments, peptide tags provided herein comprise a His peptide tag, wherein the His peptide tag comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 His residues. In some embodiments, peptide tags provided herein comprise a His peptide tag, wherein the His peptide tag comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 His residues. In some embodiments, the His peptide tag comprises from 1 to 5, 2 to 6, 3 to 7, 4 to 8, 5 to 9, 6 to 10, 7 to 11, 8 to 12, 9 to 13, 10 to 14, 1 to 10, 2 to 11, 3 to 12, 4 to 13, 5 to 14, 6 to 15, 7 to 16, 8 to 17, 9 to 19, 10 to 20, 1 to 20, 2 to 19, 3 to 18, 4 to 17, 5 to 16, 6 to 15, 7 to 14, 8 to 13, 9 to 12, 10 to 11, 1 to 6, 1 to 7, 1 to 8, 1 to 9, or 1 to 10 His residues.

In some embodiments, peptide tags provided herein comprise a sequence that can be cleaved by a protease. In some embodiments, the peptide tag comprises a protease cleavage site. Non-limiting examples of proteases and associated cleavage residues (in parentheses) include trypsin (Arg or Lys), chymotrypsin (Trp, Tyr, Phe, Leu, Met, or His), endoproteinase Asp-N (Asp), endoproteinase Arg-C (Arg), endoproteinase Glu-C (Glu), endoproteinase Lys-C (Lys), proline-endopeptidase (Pro), pepsin (Phe, Tyr, Trp, or Leu), thermolysin (Ile, Leu, Val, Ala, Met, or Phe), thrombin (Arg), elastase (Ala or Val), papain (Leu or Gly), proteinase K (aromatic amino acids), subtilisin (His, Ser, Asp), and clostripain (Arg). In some embodiments, the peptide tags comprise a sequence that can be cleaved by a carboxypeptidase, carboxypeptidase A, carboxypeptidase B, carboxypeptidase P, carboxypeptidase Y, cathepsin C, acylamino-acid-releasing enzyme, or pyroglutamate aminopeptidase. In some embodiments, the peptide tags provided herein comprise a protease cleavage site, wherein the peptide tag comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 protease cleavage sites. In some embodiments, peptide tags provided herein comprise a protease cleavage site, wherein the peptide tag comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 protease cleavage sites. In some embodiments, the peptide tag comprises from 1 to 5, 2 to 6, 3 to 7, 4 to 8, 5 to 9, 6 to 10, 7 to 11, 8 to 12, 9 to 13, 10 to 14, 1 to 10, 2 to 11, 3 to 12, 4 to 13, 5 to 14, 6 to 15, 7 to 16, 8 to 17, 9 to 19, 10 to 20, 1 to 20, 2 to 19, 3 to 18, 4 to 17, 5 to 16, 6 to 15, 7 to 14, 8 to 13, 9 to 12, 10 to 11, 1 to 6, 1 to 7, 1 to 8, 1 to 9, or 1 to 10 protease cleavage sites. In some cases, the protease cleavage site has the sequence: DDDDK. In some cases, the protease cleavage site has at least four “D” residues. The protease cleavage site may be at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to the sequence DDDDK. In some cases, the peptide tag may comprise a sequence that resembles a protease cleavage site, but that actually does not serve as a site of proteolytic cleavage. In some embodiments, peptide tags provided herein comprise an Asp (D) tag, wherein the peptide tag comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 Asp residues, e.g., DD, DDD, DDDD, etc. In some embodiments, the Asp tag comprises at least 4 Asp residues. In some embodiments, peptide tags provided herein comprise an Asp tag with no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 20 Asp residues. In some embodiments, the Asp tag comprises from 1 to 5, 2 to 6, 3 to 7, 4 to 8, 5 to 9, 6 to 10, 7 to 11, 8 to 12, 9 to 13, 10 to 14, 1 to 10, 2 to 11, 3 to 12, 4 to 13, 5 to 14, 6 to 15, 7 to 16, 8 to 17, 9 to 19, 10 to 20, 1 to 20, 2 to 19, 3 to 18, 4 to 17, 5 to 16, 6 to 15, 7 to 14, 8 to 13, 9 to 12, 10 to 11, 1 to 6, 1 to 7, 1 to 8, 1 to 9, or 1 to 10 Asp residues. In some embodiments, the peptide tag comprises a His tag (as described herein) and an Asp tag. In some embodiments, one or more Asp residues is substituted with another amino acid (e.g., one or more Glu residues). In some embodiments, the His tag is substituted with one or more amino acids (e.g., Lys or Arg).
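
For illustration only, the His tag, Asp tag, and DDDDK features described above can be located in a candidate tag sequence with a short script. The following Python sketch assumes single-letter amino acid codes; the function names longest_run and find_sites and the example tag string are hypothetical, and the example string is not any sequence disclosed herein.

```python
import re
from typing import List


def longest_run(seq: str, residue: str) -> int:
    """Length of the longest uninterrupted run of one residue."""
    runs = re.findall(re.escape(residue) + "+", seq)
    return max((len(r) for r in runs), default=0)


def find_sites(seq: str, site: str = "DDDDK") -> List[int]:
    """0-based start positions of every occurrence of the site.
    A lookahead is used so that overlapping occurrences are reported."""
    pattern = "(?=" + re.escape(site) + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


# Made-up tag for demonstration: a His run followed by a DDDDK site.
tag = "MHHHHHHGSDDDDKAS"
print(longest_run(tag, "H"), longest_run(tag, "D"), find_sites(tag))
```

A match found this way indicates only that the motif is present; as noted above, a sequence resembling a cleavage site does not necessarily serve as a site of proteolytic cleavage.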

All references to polypeptides, proteins and peptides, as used herein, refer to a polymer of amino acid residues. That is, a description directed to a polypeptide applies equally to a description of a peptide and a description of a protein, and vice versa. The terms apply to naturally occurring amino acid polymers as well as amino acid polymers in which one or more amino acid residues is a non-naturally occurring amino acid, e.g., an amino acid analog. As used herein, the terms encompass amino acid chains of any length, including full length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds.

The term “amino acid” refers to naturally occurring and non-naturally occurring amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally encoded amino acids are the 20 common amino acids (alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine) and pyrrolysine and selenocysteine. Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, such as homoserine, norleucine, methionine sulfoxide, and methionine methyl sulfonium. Such analogs have modified R groups (such as norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.

As used herein, the term “unnatural amino acid” or “non-naturally encoded amino acid” refers to any amino acid, modified amino acid, and/or amino acid analogue that is not one of the 20 common naturally occurring amino acids or selenocysteine or pyrrolysine. Other terms that may be used synonymously with the term “non-naturally encoded amino acid” and “unnatural amino acid” are “non-natural amino acid,” “non-naturally-occurring amino acid,” and variously hyphenated and non-hyphenated versions thereof. The term “non-naturally encoded amino acid” also includes, but is not limited to, amino acids that occur by modification (e.g., post-translational modifications) of a naturally encoded amino acid (including but not limited to, the 20 common amino acids or pyrrolysine and selenocysteine) but are not themselves naturally incorporated into a growing polypeptide chain by the translation complex. Examples of such non-naturally-occurring amino acids include, but are not limited to, N-acetylglucosaminyl-L-serine, N-acetylglucosaminyl-L-threonine, O-phosphotyrosine, aminoadipic acid, beta-alanine, beta-aminopropionic acid, aminobutyric acid, piperidinic acid, aminocaproic acid, aminoheptanoic acid, aminoisobutyric acid, aminopimelic acid, diaminobutyric acid, desmosine, diaminopimelic acid, diaminopropionic acid, N-ethylglycine, N-ethylasparagine, hydroxylysine, allo-hydroxylysine, hydroxyproline, isodesmosine, allo-isoleucine, N-methylglycine, sarcosine, N-methylisoleucine, N-methylvaline, norvaline, norleucine, ornithine, 4-hydroxyproline, gamma-carboxyglutamate, epsilon-N,N,N-trimethyllysine, epsilon-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, sigma-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline).

The term “peptide” refers to a polymer composed of one to about 50 amino acid residues, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof linked via peptide bonds.

The term “polypeptide” refers to a polymer composed of at least about 50 amino acid residues, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof linked via peptide bonds. As used herein, polypeptides provided herein may be fusion polypeptides or proteins.

The term “nucleic acid” refers to naturally occurring and non-naturally occurring nucleic acids, as well as nucleic acid analogs that function in a manner similar to the naturally occurring nucleic acids. The nucleic acids may be selected from RNA, DNA or nucleic acid analog molecules, such as sugar- or backbone-modified ribonucleotides or deoxyribonucleotides. It should be noted, however, that other nucleic acid analogs, such as peptide nucleic acids (PNA) or locked nucleic acids (LNA), are also suitable. Examples of non-naturally occurring nucleic acids include: halogen-substituted bases, alkyl-substituted bases, hydroxy-substituted bases, and thiol-substituted bases, as well as 5-propynyl-uracil, 2-thio-5-propynyl-uracil, 5-methylcytosine, isoguanine, isocytosine, pseudoisocytosine, 4-thiouracil, 2-thiouracil and 2-thiothymine, inosine, 2-aminopurine, N9-(2-amino-6-chloropurine), N9-(2,6-diaminopurine), hypoxanthine, N9-(7-deaza-guanine), N9-(7-deaza-8-aza-guanine) and N8-(7-deaza-8-aza-adenine), 2-amino-6-“h”-purines, 6-amino-2-“h”-purines, 6-oxo-2-“h”-purines, 2-oxo-4-“h”-pyrimidines, 2-oxo 6-“h”-purines, 4-oxo-2-“h”-pyrimidines. Those will form two hydrogen bond base pairs with non-thiolated and thiolated bases; respectively, 2,4 dioxo and 4-oxo-2-thioxo pyrimidines, 2,4 dioxo and 2-oxo-4-thioxo pyrimidines, 4-amino-2-oxo and 4-amino-2-thioxo pyrimidines, 6-oxo-2-amino and 6-thioxo-2-amino purines, 2-amino-4-oxo and 2-amino-4-thioxo pyrimidines, and 6-oxo-2-amino and 6-thioxo-2-amino purines.

The term “about,” as used herein, unless otherwise indicated, refers to a value that is no more than 10% above or below the value being modified by the term. For example, the term “about −20° C.” means a range of from −22° C. to −18° C. As another example, “about 1 hour” means a range of from 54 minutes to 66 minutes.
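
The arithmetic behind this definition can be shown in a few lines. The following Python sketch is illustrative only; the function name about_range is hypothetical, and the tolerance is taken on the magnitude of the value so that negative temperatures are handled as in the −20° C. example.

```python
from typing import Tuple


def about_range(value: float, percent: float = 10) -> Tuple[float, float]:
    """Lower and upper bounds for 'about value', i.e. no more than
    `percent` percent below or above the stated value."""
    tolerance = abs(value) * percent / 100
    return (value - tolerance, value + tolerance)


print(about_range(-20))   # about -20 deg C  -> (-22.0, -18.0)
print(about_range(60))    # about 1 hour, in minutes -> (54.0, 66.0)
```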

Linkages

In some embodiments, peptide tags provided herein enhance the stability of a protein (or polypeptide) after being linked to the protein in some manner (e.g., covalent or noncovalent linkage). In some cases, a peptide (e.g., the peptide of SEQ ID NO: 1, 8, 10, 13, or 14) is covalently linked to a polypeptide or enzyme (e.g., Taq, Tgo, TAPP, CDA, Pyrococcus horikoshii deaminase). The peptide may be linked to the N-terminus of the polypeptide or enzyme (e.g., Taq, Tgo, TAPP, CDA, Pyrococcus horikoshii deaminase). For example, a peptide that is at least 70% identical to a peptide encoded by SEQ ID NO: 3 may be linked to the N-terminus of Taq polymerase as depicted in FIG. 3. Similarly, a peptide that is at least 70% identical to a peptide encoded by SEQ ID NO: 7 may be linked to the N-terminus of Taq polymerase as depicted in FIG. 7. In other cases, the peptide is linked to the C-terminus of a polypeptide or enzyme. For example, FIG. 13 depicts the nucleic acid sequence of Tgo polymerase that is linked at its C-terminus to a fragment of DSP peptide.

In some cases, multiple peptide tags are linked to a polypeptide described herein. A polypeptide can be linked to multiple copies of the same peptide tag or to two or more different peptide tags. In some examples, one peptide tag is linked to the N-terminus of the polypeptide, while a second peptide tag is linked to the C-terminus of the polypeptide. For example, the polypeptide shown in FIG. 12 (SEQ ID NO: 11) includes a peptide tag (SEQ ID NO: 1) linked to the N-terminus of Tgo polymerase and also a different peptide tag (DSP fragment) (underlined portion of SEQ ID NO: 11) fused to the C-terminus of the Tgo polymerase. In still other examples, two or more (same or different) tags are linked in tandem to a polypeptide. For example, FIG. 7 (SEQ ID NO: 6) depicts a fragment of SEQ ID NO: 1 linked to the DSP peptide of SEQ ID NO: 10 (FIG. 11), which is then linked to another fragment of SEQ ID NO: 1, which is linked to the N-terminus of Taq polymerase.

In some cases, peptide tags provided herein are directly linked to each other and/or to the polypeptide. FIG. 7 shows an example of tags directly linked to each other, and then directly linked to a polypeptide. In other cases, the tags are separated from each other by a linker (e.g., peptide linker or other linker described herein). The tags may also be linked to the polypeptide by a linker.

In some embodiments, a peptide is linked to the polypeptide or enzyme (e.g., polymerase) via genetic engineering. For example, a DNA construct is created that is capable of expressing a polypeptide comprising the peptide (e.g., the peptide of SEQ ID NO: 1) fused to an enzyme (e.g., Taq polymerase). One example of a portion of the nucleic acid sequence of such construct is depicted in FIG. 5 (SEQ ID NO: 4). Another example is depicted in FIG. 6 (SEQ ID NO: 5), and still another is depicted in FIG. 13 (SEQ ID NO: 12).

In some cases, a polypeptide (e.g., polymerase, Taq polymerase, etc.) is linked to a peptide comprising a fragment (also referred to herein as a “portion”) of double stranded binding protein (DSP) (SEQ ID NO: 6, underlined but not bolded portion), or variants, fragments, or mutants thereof. In some cases, the DSP fragment is linked to a peptide tag (e.g., to a peptide that is SEQ ID NO: 1 or 13, or mutants or variants thereof). In still other cases, the peptide (e.g., the peptide of SEQ ID NO: 1 or 13) is linked to a fragment of a polymerase, which is linked to a second polymerase.

In some embodiments, a peptide may also be linked to the enzyme through a linker, such as a peptide linker. The peptide sequence linker may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 amino acids in length. Examples are polypeptides that contain multiple aspartate or glutamate residues. The sequence and length of an appropriate peptide can be determined by methods known in the art, for example by employing a peptide linker prediction software program to identify potential linkers. One example of such linker program is disclosed in George and Heringa, (2003), Protein Engineering, 15(11):871-879.

In still other cases, a peptide (e.g., the peptide of SEQ ID NO: 1) is linked to an enzyme via a non-covalent linkage. Examples of linkers that may be useful include: acid labile linkers, ester linkers, hydrazone linkers, sulfonamide-containing linkers, enzymatically cleavable linkers, or polymer based linkers. Polymer-based linkers, such as polyethylene-glycol (PEG, Formula VI), are widely used to conjugate both small molecule and large molecule drugs. When used to link a peptide to a therapeutic, the PEG conjugated polypeptides may offer a number of desirable advantages including higher solubility, less immunogenicity, improved half-life, targeted delivery and enhanced activity of the drugs.

Many molecules with multiple reactive groups can serve as useful cross-linking components and are commercially available from companies like Sigma-Aldrich or Pierce. Of particular utility are cross-linking components that are available in activated form and can be directly used for conjugation. Cross-linking components can comprise multiple reactive groups with similar or identical chemical structure. Such reactive groups can be simultaneously activated and coupled to multiple identical non-cross-linking components resulting in the direct formation of homomultimeric products. Examples of cross-linking components with multiple similar reactive groups are citric acid, EDTA, and TSAT. Branched PEG molecules containing multiple identical reactive groups may also be useful.

There are a large number of specific chemical products that work based on the following small number of basic reaction schemes, all of which are described in detail at www.piercenet.com. Examples of useful crosslinking agents are imidoesters, active halogens, maleimide, pyridyl disulfide, and NHS-esters. Homobifunctional crosslinking agents have two identical reactive groups and are often used in a one-step chemical crosslinking procedure. Examples are BS3 (a non-cleavable water-soluble DSS analog), BSOCOES (base-reversible), DMA (Dimethyl adipimidate-2HCl), DMP (Dimethyl pimelimidate-2HCl), DMS (Dimethyl suberimidate-2HCl), DSG (5-carbon analog of DSS), DSP (Lomant's reagent), DSS (non-cleavable), DST (cleavable by oxidizing agents), DTBP (Dimethyl 3,3′-dithiobispropionimidate-2HCl), DTSSP, EGS, Sulfo-EGS, THPP, and TSAT. DFDNB (1,5-Difluoro-2,4-dinitrobenzene) is especially useful for crosslinking between small spatial distances (Kornblatt, J. A. and Lake, D. F. (1980). Cross-linking of cytochrome oxidase subunits with difluorodinitrobenzene. Can J. Biochem. 58, 219-224).

Sulfhydryl-reactive homobifunctional crosslinking agents are homobifunctional protein crosslinkers that react with sulfhydryls and are often based on maleimides, which react with -SH groups at pH 6.5-7.5, forming stable thioether linkages. BM[PEO]3 has an 8-atom polyether spacer that reduces potential for conjugate precipitation in sulfhydryl-to-sulfhydryl cross-linking applications. BM[PEO]4 is similar but with an 11-atom spacer. BMB is a non-cleavable crosslinker with a four-carbon spacer. BMDB makes a linkage that can be cleaved with periodate. BMH is a widely used homobifunctional sulfhydryl-reactive crosslinker. BMOE has an especially short linker. DPDPB and DTME are cleavable crosslinkers. HVBS does not have the hydrolysis potential of maleimides. TMEA is another option. Hetero-bifunctional crosslinking agents have two different reactive groups. Examples are NHS-esters and amines/hydrazines via EDC activation, AEDP, ASBA (photoreactive, iodinatable), EDC (water-soluble carbodiimide). Amine-Sulfhydryl reactive bifunctional crosslinkers are AMAS, APDP, BMPS, EMCA, EMCS, GMBS, KMUA, LC-SMCC, LC-SPDP, MBS, SBAP, SIA (extra short), SIAB, SMCC, SMPB, SMPH, SMPT, SPDP, Sulfo-EMCS, Sulfo-GMBS, Sulfo-KMUS, Sulfo-LC-SMPT, Sulfo-LC-SPDP, Sulfo-MBS, Sulfo-SIAB, Sulfo-SMCC, Sulfo-SMPB. Amino-group reactive heterobifunctional crosslinking agents are ANB-NOS, MSA, NHS-ASA, SADP, SAED, SAND, SANPAH, SASD, SFAD, Sulfo-HSAB, Sulfo-NHS-LC-ASA, Sulfo-SADP, Sulfo-SANPAH, TFCS. Arginine-reactive crosslinking agents are, for example, APG, which reacts specifically with arginines at pH 7-8.

Some Properties of the Peptide Tags

In some cases, the peptide enables an enzyme to exhibit at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the activity of the same or similar enzyme that is not linked to the peptide. In some cases, the peptide enables an enzyme to exhibit at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of its enzymatic activity prior to long-term or short-term exposure to a temperature (e.g., room temperature, any temperature above −20° C.). In some cases, the peptide enables a polypeptide to exhibit at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of its binding affinity compared to its binding affinity prior to long-term or short-term exposure to a temperature (e.g., room temperature, any temperature above −20° C.).
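
As a simple worked illustration of the percentages recited above, retained activity can be expressed as the activity measured after exposure divided by the activity measured before exposure. The following Python sketch is a non-limiting example; the function names and the numeric activity values are hypothetical and do not represent measured data.

```python
def percent_retained(activity_before: float, activity_after: float) -> float:
    """Activity after exposure as a percentage of activity before exposure."""
    if activity_before <= 0:
        raise ValueError("activity_before must be positive")
    return 100.0 * activity_after / activity_before


def meets_threshold(activity_before: float, activity_after: float,
                    threshold_percent: float) -> bool:
    """True if at least threshold_percent of the activity is retained."""
    return percent_retained(activity_before, activity_after) >= threshold_percent


# Made-up activity units for demonstration only.
print(percent_retained(200.0, 150.0))        # 75.0
print(meets_threshold(200.0, 150.0, 70.0))   # True
```

The same calculation applies to binding affinity or any other measured property, provided both measurements use the same assay and units.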

In some cases, a polypeptide fusion protein described herein can enhance the activity of other polypeptides in a reaction mixture. For example, in some cases, a fusion polypeptide (e.g., a fusion polypeptide encoded by SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 12) enhances the sensitivity, specificity, fidelity, or yield of a reaction. For example, a fusion polypeptide with DNA polymerase activity (e.g., SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11) can be added to a sample containing a second (different) DNA polymerase (e.g., Taq polymerase, the Taq fusion of SEQ ID NO: 2, SEQ ID NO: 6, or SEQ ID NO: 11), and thereby enhance the specificity, fidelity, sensitivity or yield of the second DNA polymerase. In some cases, the enhancement is more than 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100%, 125%, 150%, 175%, 200%, 250%, 300%, 350%, 400%, 450%, 500%, 600%, 700%, 800%, 900%, 1000%, 2000%, 2500%, 3000%, 4000%, or 5000%. In some cases, the fusion polypeptide also enhances the specificity, fidelity or yield of a third polymerase, or of a reaction mix containing three or more polymerases.

In some embodiments, peptide tags described herein will enhance the stability, enzymatic activity, or other property of a fusion polypeptide after short- or long-term exposure to a certain temperature (e.g., room temperature). For example, a polymerase (e.g., Taq polymerase) may lose a substantial portion of its activity after exposure to room temperature for a period of a week or more, or even a day or more or three hours or more.

In some embodiments, a composition disclosed herein (e.g., a peptide with the amino acid sequence of SEQ ID NO: 1 (FIG. 2), of SEQ ID NO: 8 (FIG. 9), or of SEQ ID NO: 10 or 13) may be linked to the enzyme or polymerase (e.g., Taq polymerase, Tgo polymerase) and thereby enable the enzyme or polymerase to retain activity after exposure to a temperature (e.g., room temperature) over time. In some cases, a peptide is linked to a polymerase (e.g., Taq polymerase) and thereby enables the polymerase to retain at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of its activity, even after long-term or short-term exposure to a certain temperature (e.g., room temperature of about 20° C. to 22° C.). In some cases, the peptide is linked to a polymerase or enzyme that is not Taq polymerase.

In some embodiments, peptide tags described herein may also enhance the ability of a polypeptide (e.g., polymerase) to bind to single-stranded DNA and/or double-stranded DNA. Often, such DNA-binding is nonspecific. In some cases, the enhancement is more than 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100%, 125%, 150%, 175%, 200%, 250%, 300%, 350%, 400%, 450%, 500%, 600%, 700%, 800%, 900%, 1000%, 2000%, 2500%, 3000%, 4000%, or 5000%. The DSP peptide tag depicted in SEQ ID NO: 10 and fragments, variants, and mutants thereof, may be especially suited for enhancing the non-specific double- or single-stranded DNA-binding ability of a polypeptide.

In some embodiments, polypeptides, fusion polypeptides, or compositions retain activity thereof at a temperature between −20° C. and 50° C. In some embodiments, polypeptides, fusion polypeptides, or compositions retain activity thereof at a temperature between −15° C. and 50° C.; between −10° C. and 50° C.; between −5° C. and 50° C.; between 0° C. and 50° C.; between 5° C. and 50° C.; between 10° C. and 50° C.; between 15° C. and 50° C.; between 20° C. and 50° C.; between 20° C. and 45° C.; between 20° C. and 40° C.; between 20° C. and 35° C.; between 20° C. and 30° C.; between 20° C. and 25° C.; between 20° C. and 22° C.; between 15° C. and 25° C.; between 10° C. and 25° C.; between 5° C. and 25° C.; between 0° C. and 25° C.; between 0° C. and 30° C.; between 0° C. and 35° C.; between 0° C. and 40° C.; between 0° C. and 45° C.; between 5° C. and 10° C.; between 5° C. and 15° C.; between 5° C. and 20° C.; between 5° C. and 25° C.; between 5° C. and 30° C.; between 5° C. and 35° C.; between 5° C. and 40° C.; between 5° C. and 45° C.; between 10° C. and 15° C.; between 10° C. and 20° C.; between 10° C. and 25° C.; between 10° C. and 30° C.; between 10° C. and 35° C.; between 10° C. and 40° C.; between 10° C. and 45° C.; between 15° C. and 20° C.; between 15° C. and 30° C.; between 15° C. and 35° C.; between 15° C. and 40° C.; or between 15° C. and 45° C.

In some cases, the fusion polypeptide (e.g., fusion protein of SEQ ID NO: 2, 6, or 11) is exposed to a temperature that is at least about −20° C., −19° C., −18° C., −17° C., −16° C., −15° C., −14° C., −13° C., −12° C., −11° C., −10° C., −9° C., −8° C., −7° C., −6° C., −5° C., −4° C., −3° C., −2° C., −1° C., 0° C., 1° C., 2° C., 3° C., 4° C., 5° C., 6° C., 7° C., 8° C., 9° C., 10° C., 11° C., 12° C., 13° C., 14° C., 15° C., 16° C., 17° C., 18° C., 19° C., 20° C., 21° C., 22° C., 23° C., 24° C., 25° C., 26° C., 27° C., 28° C., 29° C., 30° C., 31° C., 32° C., 33° C., 34° C., 35° C., 36° C., 37° C., 38° C., 39° C., 40° C., 41° C., 42° C., 43° C., 44° C., 45° C., 46° C., 47° C., 48° C., 49° C., or 50° C. and then retains a certain percentage of its stability, activity, sensitivity, fidelity, yield, or other property. The exposure to the temperature may be short-term or long-term. The exposure to the temperature may be for at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 minutes. The exposure to the temperature may be for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, at least 1, 2, 3, 4, 5, or 6 days, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 weeks, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 years. In some cases, the temperature is room temperature (e.g., about 20° C. to 22° C.). In some cases, the polymerase is exposed to room temperature for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 weeks. For example, the polymerase (e.g., Taq) may be exposed to room temperature or at least 35° C. for at least 4 weeks, at least 6 weeks, at least 10 weeks, or at least 15 weeks.

In one example, FIG. 1 depicts the activity of a Taq polymerase fused to a peptide either after storage at −20° C. or after storage at +35° C. for 4 weeks. The top panel shows the polymerase activity when the polymerase is stored at −20° C.; the lower panel shows the polymerase activity of the peptide-fusion polypeptide after being stored at 35° C. for 4 weeks. As shown in FIG. 1, the Taq polymerase fusion enzyme exhibits similar activity after being stored under both conditions.

Polypeptides

In some embodiments, a composition described herein (e.g., the peptide of SEQ ID NO: 1, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 13, or fragments, variants, or mutants thereof) may be linked to a variety of polypeptides, proteins, enzymes, or peptides.

In some embodiments, a peptide tag described herein may be linked to any enzyme useful for a polymerase chain reaction (PCR), the method of K. B. Mullis, e.g., as described in U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, and any other improved method known in the art. PCR is a method for increasing the concentration of a segment of a target sequence in a mixture of DNA without cloning or purification. This process for amplifying the target sequence typically consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double-stranded target sequence. To effect amplification, the mixture is denatured and the primers are then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing, and extension constitute one cycle) to obtain a high concentration of an amplified segment of the desired target sequence.
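As a rough illustration of why repeating the cycle yields a high concentration of the amplified segment, the idealized copy number after a given number of cycles can be sketched as follows. The function name and the per-cycle efficiency parameter are assumptions introduced for illustration only and are not values from this disclosure; real reactions plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized amplicon count after `cycles` rounds of PCR.

    efficiency is the fraction of templates copied per cycle
    (1.0 = perfect doubling). This is a textbook approximation,
    not a model of any particular polymerase.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    # Each cycle multiplies the template count by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One template, 30 ideal cycles.
    print(pcr_copies(1, 30))
    # Same starting point with an assumed 90% per-cycle efficiency.
    print(pcr_copies(1, 30, efficiency=0.9))
```

Under the perfect-doubling assumption, each cycle doubles the number of target segments, so the yield grows exponentially with the cycle count.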

In some embodiments, various polymerases may be linked to the peptide tags described herein. Such polymerases include Taq polymerase (useful, e.g., in polymerase chain reaction (PCR) assays), DNA polymerase I (useful, e.g., in nick translation and primer-extension assays), Klenow polymerase (useful, e.g., in random-primer labeling), terminal deoxynucleotidyl transferase (TdT) (useful, e.g., for 3′-end labeling), reverse transcriptase (e.g., for synthesizing DNA from RNA templates), or other polymerases such as SP6 RNA polymerase, T3 RNA polymerase, and T7 RNA polymerase for in vitro transcription.

In some embodiments, a peptide tag described herein may be linked to a DNA-dependent DNA polymerase, which is an enzyme that synthesizes a complementary DNA copy from a DNA template by adding a nucleotide to the 3′ end of a newly-forming strand. Some DNA polymerases also have proof-reading ability, which is conferred by 3′ to 5′ exonuclease activity.

In some embodiments, DNA-dependent DNA polymerases provided herein may be the naturally occurring enzymes isolated from bacteria or bacteriophages or expressed recombinantly, or may be modified or evolved forms that have been engineered to possess certain desirable characteristics, e.g., thermostability, or the ability to recognize or synthesize a DNA strand from various modified templates. DNA-dependent DNA polymerases require a complementary primer to initiate synthesis. It is known that under suitable conditions a DNA-dependent DNA polymerase may synthesize a complementary DNA copy from an RNA template. RNA-dependent DNA polymerases (described herein) typically also have DNA-dependent DNA polymerase activity.

Non-limiting examples of DNA polymerases include Thermus aquaticus (Taq) DNA polymerase, E. coli DNA polymerase I, Thermus thermophilus (Tth) DNA polymerase, Bacillus stearothermophilus DNA polymerase, Thermococcus litoralis DNA polymerase, bacteriophage T7 DNA polymerase, Thermococcus gorgonarius (Tgo) polymerase, Pfu polymerase, Klenow fragment of E. coli DNA polymerase, Tma DNA polymerase, exo-Tli DNA polymerase, exo-KOD DNA polymerase, exo-JDF-3 DNA polymerase, exo-PGB-D DNA polymerase, UlTma (N-truncated) Thermotoga maritima DNA polymerase, or DNA polymerases from bacteriophages T4, Phi-29, M2, or T5.

In some embodiments, where desired, temperature stable polymerases may be linked to a peptide tag disclosed herein. See, e.g., U.S. Pat. No. 4,889,818, which discloses a representative thermostable enzyme isolated from Thermus aquaticus. Additional representative temperature stable polymerases include, without limitation, polymerases extracted from bacteria such as Thermus aquaticus DNA polymerase I (Taq), Thermococcus gorgonarius (Tgo), Pyrococcus horikoshii, Pyrococcus furiosus, Pyrococcus woesei, Thermus filiformis, Thermus flavus, Thermus ruber, Thermus thermophilus, Bacillus stearothermophilus (which has a somewhat lower temperature optimum than the others listed), Thermus lacteus, Thermus rubens, Thermotoga maritima, Thermococcus litoralis, and Methanothermus fervidus.

In some cases, a peptide tag described herein is linked to a thermostable enzyme that may or may not necessarily have polymerase activity (e.g., Thermoplasma acidophilum pyrophosphatase (TAPP), pyrophosphatase, dCTP deaminase (CDA), deoxycytidine deaminase, cytidine deaminase, RNA deaminase, DNA deaminase). In some cases, a peptide tag (e.g., the peptide of SEQ ID NO: 1 or 8) is linked to a nonthermostable polypeptide (e.g., human Leukemia Inhibitor Factor (hLIF), leptin).

In some embodiments, a peptide tag described herein may be linked to polymerases that exhibit strand-displacement activity (also known as rolling circle polymerization). Strand displacement can result in the synthesis of tandem copies of a circular DNA template, and is particularly useful in isothermal PCR reactions. Non-limiting examples of suitable rolling circle polymerases provided herein include T5 DNA polymerase (Chatterjee et al., Gene 97:13-19 (1991)), T4 DNA polymerase holoenzyme (Kaboord and Benkovic, Curr. Biol. 5:149-157 (1995)), phage M2 DNA polymerase (Matsumoto et al., Gene 84:247 (1989)), phage PRD1 DNA polymerase (Jung et al., Proc. Natl. Acad. Sci. USA 84:8287 (1987), and Zhu and Ito, Biochim. Biophys. Acta. 1219:267-276 (1994)), and Klenow fragment of DNA polymerase I (Jacobsen et al., Eur. J. Biochem. 45:623-627 (1974)).

One example of a class of rolling circle polymerases utilizes protein priming as a way of initiating replication. Exemplary polymerases of this class are modified and unmodified DNA polymerases, chosen or derived from the phages φ29, PRD1, Cp-1, Cp-5, Cp-7, φ15, φ1, φ21, φ25, BS32, L17, PZE, PZA, Nf, M2Y (or M2), PR4, PR5, PR722, B103, SF5, GA-1, and related members of the Podoviridae family.

In some embodiments, a peptide tag described herein may be linked to a DNA-dependent RNA polymerase or transcriptase, which is an enzyme that synthesizes multiple RNA copies from a double-stranded or partially-double-stranded DNA molecule having a promoter sequence that is usually double-stranded. The RNA molecules are synthesized in the 5′-to-3′ direction beginning at a specific position just downstream of the promoter. Examples of transcriptases are the DNA-dependent RNA polymerases from E. coli and bacteriophages T7, T3, and SP6.

In some embodiments, a peptide tag described herein may be linked to an RNA-dependent DNA polymerase or reverse transcriptase (RT), which is an enzyme that synthesizes a complementary DNA copy from an RNA template. In RT-PCR, reverse transcription is coupled to PCR, e.g., as described in U.S. Pat. No. 5,322,770. The RNA template is converted to cDNA by the reverse transcriptase activity of an enzyme, and the cDNA is then amplified using the polymerizing activity of the same or a different enzyme. All known reverse transcriptases also have the ability to make a complementary DNA copy from a DNA template; thus, they are both RNA- and DNA-dependent DNA polymerases. RTs may also have an RNase H activity. Both thermostable and thermolabile reverse transcriptases and polymerases can be used.

A common reverse transcriptase can be derived from Moloney murine leukemia virus (MMLV-RT). The peptide tags described herein may be linked to polypeptides having reverse transcriptase activity including but not limited to: Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Rous Sarcoma Virus (RSV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Rous-Associated Virus (RAV) reverse transcriptase, Myeloblastosis Associated Virus (MAV) reverse transcriptase, Human Immunodeficiency Virus (HIV) reverse transcriptase, Avian Sarcoma-Leukosis Virus (ASLV) reverse transcriptase, retroviral reverse transcriptase, retrotransposon reverse transcriptase, hepatitis B reverse transcriptase, cauliflower mosaic virus reverse transcriptase, bacterial reverse transcriptase, Thermus thermophilus (Tth) DNA polymerase, Thermus aquaticus (Taq) DNA polymerase, Thermotoga neapolitana (Tne) DNA polymerase, Thermotoga maritima (Tma) DNA polymerase, Thermococcus litoralis (Tli or VENTR™) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase, DEEPVENT™, Pyrococcus species GB-D DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase, Bacillus caldophilus (Bca) DNA polymerase, Sulfolobus acidocaldarius (Sac) DNA polymerase, Thermoplasma acidophilum (Tac) DNA polymerase, Thermus flavus (Tfl/Tub) DNA polymerase, Thermus ruber (Tru) DNA polymerase, Thermus brockianus (DYNAZYME™) DNA polymerase, Methanobacterium thermoautotrophicum (Mth) DNA polymerase, and mutants, variants and derivatives thereof.

In some embodiments, a peptide tag described herein may be linked to an amylase. Non-limiting examples of amylases include those from Bacillus amyloliquefaciens, Bacillus licheniformis, Bacillus stearothermophilus, Bacillus subtilis, Lactobacillus manihotivorans, Myceliophthora thermophila, Pyrococcus furiosus, Pyrococcus woesei, Staphylothermus marinus, Sulfolobus solfataricus, Thermococcus aggregans, Thermococcus fumicolans, Thermococcus hydrothermalis, Thermomyces lanuginosus, Thermococcus profundus, Bacillus circulans, Bacillus cereus var. mycoides, and Clostridium thermosulphurogenes.

In some embodiments, a peptide tag described herein may be linked to a pullulanase. Non-limiting examples of pullulanases include those from Bacillus sp., Pyrococcus furiosus, Pyrococcus woesei, Thermococcus aggregans, Thermus caldophilus GK24, Thermococcus celer, Thermococcus hydrothermalis, Thermococcus litoralis, and Thermotoga maritima MSB8.

In some embodiments, a peptide tag described herein may be linked to a xylanase. Non-limiting examples of xylanases include those from Bacillus amyloliquefaciens, Bacillus circulans, Bacillus sp. Strain SPS-0, Bacillus subtilis, Clostridium absonum, Dictyoglomus sp. Strain B₁, Fusarium proliferatum, Pyrococcus furiosus, Scytalidium thermophilum, Streptomyces sp. Strain S38, Sulfolobus solfataricus, Thermomyces lanuginosus, Thermoascus aurantiacus, Thermotoga maritima MSB8, Thermotoga neapolitana, Thermotoga sp. Strain FjSS3-B1, and Thermotoga thermarum.

In some embodiments, a peptide tag described herein may be linked to a cellulase. Non-limiting examples of cellulases include those from Anaerocellum thermophilum, Bacillus subtilis, Pyrococcus furiosus, Pyrococcus horikoshii, Rhodothermus marinus, Thermotoga maritima MSB8, and Thermotoga neapolitana (Endocellulase A or B).

In some embodiments, a peptide tag described herein may be linked to a proteolytic enzyme. Non-limiting examples of proteolytic enzymes include those from Bacillus brevis, Bacillus licheniformis, Bacillus stearothermophilus, Bacillus sp. JB-99, Bacillus stearothermophilus TP26, Bacillus sp. No. AH-101, Bacillus thermoruber, Pyrococcus sp. KOD1, Staphylothermus marinus, Thermoacidophiles, Thermococcus aggregans, Thermococcus celer, Thermococcus litoralis, and Thermotoga maritima.

In some embodiments, a peptide tag described herein may be linked to a lipase. Non-limiting examples of lipases include those from Bacillus acidocaldarius, Bacillus sp. RSJ-1, Bacillus strain J33, Bacillus stearothermophilus, Bacillus thermocatenulatus, Bacillus thermoleovorans ID-1, Geobacillus sp., Pseudomonas sp., Pyrobaculum calidifontis, Pyrococcus furiosus, and Pyrococcus horikoshii.

In some cases, one or more of the following polymerases are linked to a peptide tag encoded by SEQ ID NO: 3, by SEQ ID NO: 7, or by SEQ ID NO: 9, or other peptide tag described herein: G46E E678G CS5 DNA polymerase, G46E L329A E678G CS5 DNA polymerase, G46E E678G CS6 DNA polymerase, Δ ZO5R DNA polymerase, ZO5 polymerase, E615G Taq DNA polymerase, Thermus flavus (Tfl) polymerase (e.g., a modified Tfl polymerase that incorporates the T-terminator nucleotides described herein), Thermotoga maritima- or Tma-25 polymerase, Tma-30 polymerase, Thermus thermophilus (Tth) DNA polymerase, Pfu DNA polymerase, Pfx DNA polymerase, Thermus species SPS-17 polymerase, E615G Taq polymerase, Thermus ZO5R polymerase, T7 DNA polymerase, Kornberg DNA polymerase I or E. coli DNA Polymerase I, Klenow DNA polymerase, Taq DNA polymerase, Micrococcal DNA polymerase, alpha DNA polymerase, reverse transcriptase, AMV reverse transcriptase, M-MuLV reverse transcriptase, DNA polymerase, RNA polymerase, E. coli RNA polymerase, SP6 RNA polymerase, T3 RNA polymerase, T4 DNA polymerase, T7 RNA polymerase, RNA polymerase II, terminal transferase, polynucleotide phosphorylase (PNP), ribonucleotide incorporating DNA polymerase, or the like. In some cases, a proof-reading enzyme is linked to a polypeptide encoded by SEQ ID NO: 3, or to a polypeptide encoded by SEQ ID NO: 7. Alternatively, any polymerase (e.g., a polymerase described herein) may be linked to a fragment of a polypeptide encoded by SEQ ID NO: 7 or SEQ ID NO: 9, e.g., to double stranded-binding protein (DSP).

In some embodiments, peptide tags (or structures) provided herein may also provide stability to polypeptides that are not polymerases. The peptide tags may aid the retention of any activity of a polypeptide, e.g., binding activity or enzymatic activity, especially when the polypeptide is exposed to a temperature (e.g., room temperature) for a certain period of time. For example, the peptide tags described herein may be linked to erythropoietin (EPO) (also known as hematopoietin or hemopoietin), for instance, to enhance its stability at room temperature. EPO is a glycoprotein hormone that controls erythropoiesis, or red blood cell production. It is a cytokine for erythrocyte (red blood cell) precursors in the bone marrow. Purified forms of EPO can be used to treat diseases such as anemia or neurological diseases (e.g., schizophrenia). Types of EPO available on the market include but are not limited to erythropoietin (Epoetin-Alpha™) and Darbepoetin-Alpha™. Trade names include, but are not limited to: Epogen™, Epoetin™, Procrit™, Eprex™, NeoRecormon™, Darbepoetin™, Epoetin Delta™, PDpoetin™, Aranesp™, and Methoxy polyethylene glycol-epoetin beta (Mircera™).

EPO is encoded by a single-copy gene which has five exons. The human and mouse EPO genes have 90% similar sequences immediately upstream of the transcription start site, 80% in the coding regions, and 65% in the first intron. The locations of introns and splice donor and acceptor sites are conserved between human and mouse EPO genes. The mRNA for EPO contains both 5′ and 3′ untranslated regions and codes for a leader peptide sequence and a predicted mature EPO protein of 166 amino acids for human and mouse, and 168 amino acids for monkey. The secreted form of human EPO, both the naturally occurring EPO recovered from urine (uh-EPO) and the recombinant EPO (rh-EPO) expressed in Chinese hamster ovary (CHO) cells, lacks the C-terminal arginine, which is removed by post-translational cleavage. Mature human EPO protein comprises 165 amino acids and has a molecular weight of 34 kDa, with glycosyl residues contributing about 40% of the weight of the molecule. The EPO molecule comprises four helices that interact via their hydrophobic domains to form a predominantly globular structure within an aqueous environment (Cheetham et al., 1998, Nat. Struct. Biol. 5:861-866). Human and murine EPO have four cysteines and monkey EPO has five. Internal disulfide bridges exist in human EPO, between Cys7 and Cys161, and between Cys29 and Cys33. At least one of these disulfide bridges is important in the secondary structure.

EPO initiates erythropoiesis by binding to the extracellular portion of a preformed erythropoietin receptor (EPOR) homodimer (i.e., (EPOR)₂) in a manner that bridges between specific locations on the individual EPOR subunits. When EPO binds to the (EPOR)₂, large portions of the globular ligand are remote from the binding regions and face outward, away from the complex of EPO and (EPOR)₂ into the aqueous medium. Human EPO has four glycosylation sites: a single O-linked site at Ser126 and three N-linked sites at Asn24, Asn38, and Asn83. The N-linked glycosylation sites are conserved in murine and monkey EPO. The oligosaccharide chains of human EPO are fucose-containing, sialylated tetraantennary oligosaccharides, some of which contain repeated N-acetyllactosamines. The remaining N-linked oligosaccharides are triantennary and biantennary oligosaccharides.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise EPO in its native form. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise EPO with one or more mutations. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise EPO with one or more mutations at the four glycosylation sites: a single O-linked site at Ser126 and three N-linked sites at Asn24, Asn38, and Asn83. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise EPO with one or more mutations at the four helices. In some embodiments, polypeptides, fusion polypeptides, or compositions comprise EPO with one or more mutations at the residues that form the disulfide bridges: at Cys7, Cys161, Cys29, and Cys33. In some embodiments, the EPO mutations may comprise conserved or non-conserved amino acid substitution, deletion, or addition.

In some embodiments, any protein therapeutic may be linked to a peptide tag disclosed herein. A summary of protein therapeutics can be found in Leader et al. (2008) Nature Reviews Drug Discovery 7: 21-39. Examples of protein therapeutics include protein therapeutics with enzymatic or regulatory activity, protein therapeutics with special targeting activity, protein vaccines, and protein diagnostics.

In some embodiments, polypeptides, fusion polypeptides, or compositions provided herein comprise a cosmetic peptide or polypeptide. In some embodiments, the cosmetic peptide or polypeptide includes, but is not limited to, epidermal growth factor (EGF), keratinocyte growth factor (KGF), vascular endothelial growth factor (VEGF), fibroblast growth factor (FGF), granulocyte-colony stimulating factor (G-CSF), growth differentiation factor 9 (GDF9), hepatocyte growth factor (HGF), hepatoma derived growth factor (HDGF), insulin-like growth factor (IGF), nerve growth factor (NGF), thrombopoietin, transforming growth factor alpha (TGF-α), transforming growth factor beta (TGF-β), placental growth factor, human bone morphogenetic protein (BMP), BMP2, BMP7, platelet-derived growth factor (PDGF), collagenase, gelatinase, matrix metalloproteinase-1, -2, -3, -7, -8, -9, -10, -11, -12, -13, -14, -15, -16, -17, -18, -19, -20, -21, -23A, -23B, -24, -25, -26, -27, or -28.

Examples of protein therapeutics with enzymatic or regulatory activity include therapeutics for treating: endocrine disorders (e.g., insulin, Growth hormone (GH) somatotropin, Salmon calcitonin, human parathyroid hormone residues 1-34); haemostasis and thrombosis disorders (e.g., Factor VIIa, Factor VIII, Factor IX, Antithrombin III, Protein C concentrate, tissue plasminogen activator (tPA), urokinase); metabolic deficiencies (e.g., beta-gluco-cerebrosidase, alpha-L-iduronidase); pulmonary and gastrointestinal disorders (e.g., alpha-1-proteinase inhibitor, lactase, pancreatic enzymes); immunodeficiency disorders (e.g., adenosine deaminase, pooled immunoglobulins); blood disorders (e.g., Human albumin, erythropoietin, as described herein); fertility (e.g., human follicle stimulating hormone (FSH), Human chorionic gonadotropin (HCG), Lutropin-alpha); immunoregulation (e.g., interferon (IFN), granulocyte macrophage colony stimulating factor (GM-CSF), type I alpha-IFN, IFN-beta, IFN-gamma, IFN-gamma1beta, interleukin-1, interleukin-2, interleukin-3, interleukin-4, interleukin-5, interleukin-6, interleukin-7, interleukin-8, interleukin-9, interleukin-10, interleukin-11, interleukin-12, interleukin-13, interleukin-14, interleukin-15, interleukin-16, interleukin-17, interleukin-18, interleukin-19, interleukin-20, interleukin-21, interleukin-22, interleukin-23, interleukin-24, interleukin-25, interleukin-26, interleukin-27, interleukin-28, interleukin-29, interleukin-30, interleukin-31, interleukin-32, interleukin-33, interleukin-34, interleukin-35); growth regulation (e.g., vascular endothelial growth factor (VEGF), epidermal growth factor (EGF), fibroblast growth factor (FGF), granulocyte-colony stimulating factor (G-CSF), growth differentiation factor 9 (GDF9), hepatocyte growth factor (HGF), hepatoma derived growth factor (HDGF), insulin-like growth factor (IGF), nerve growth factor (NGF), thrombopoietin, transforming growth factor alpha (TGF-α), transforming growth factor beta (TGF-β), placental growth factor, human bone morphogenetic protein (BMP), BMP2, BMP7, platelet-derived growth factor (PDGF)). Other protein therapeutics include proteolytic therapeutics (e.g., trypsin), Nesiritide; botulinum toxin type A or B, collagenase, human deoxyribonuclease I, dornase alpha, hyaluronidase, papain, L-asparaginase, humanized antibodies (e.g., bevacizumab (Avastin™), rituximab, trastuzumab); enfuvirtide, abciximab, protein vaccines (e.g., HBsAg vaccine, HPV vaccine, OspA), and anti-rhesus IgG.

In some embodiments, protein diagnostics may also be linked to the peptide tags described herein. Examples include but are not limited to: glucagon, growth hormone releasing hormone, imaging agents for cancer and other diseases, and HIV antigens and HCV antigens.

In some embodiments, polypeptides linked to the peptide tags described herein may be fibrous proteins or globular proteins. Types of proteins to which the peptide tags can be linked include, without limitation: cytoskeletal proteins (e.g., actin, Arp2/3, Coronin, dystrophin, FtsZ, keratin, myosin, Spectrin, Tau (protein), tubulin); extracellular matrix proteins (e.g., collagen, elastin, F-spondin, Pikachurin); plasma proteins (e.g., serum albumin, Serum Amyloid P Component); coagulation factors (e.g., complement proteins, C1-inhibitor, C3-convertase, Factor VIII, Factor IX, Factor XIII, Fibrin, protein C, Protein S, Protein Z, Protein Z-related protease inhibitor, thrombin, von Willebrand Factor); acute phase proteins (e.g., C-reactive protein); hemoproteins; cell adhesion proteins (e.g., cadherin, integrin, NCAM, selectin); transmembrane transport proteins (e.g., CFTR, glycophorin D, scramblase); ion channels (e.g., acetylcholine receptor); G-protein coupled receptors; potassium channels; symport/antiport proteins; hormones and growth factors (e.g., epidermal growth factor, insulin, insulin-like growth factor, oxytocin, follicle stimulating hormone, luteinizing hormone); transcription regulatory proteins (e.g., MyoD, C-myc); nutrient storage/transport proteins (e.g., ferritin); immunoglobulins; trypsin.

In some embodiments, the polypeptides linked to the peptide tags described herein may be nucleic acid binding peptides (e.g., a peptide capable of binding any nucleic acid, including DNA, RNA, mRNA, cRNA, miRNA, siRNA, cDNA).

In some embodiments, a peptide provided herein is a restriction enzyme. Examples of restriction enzymes include AatII, Acc65I, AccI, AciI, AclI, AcuI, AfeI, AflII, AflIII, AgeI, AhdI, AleI, AluI, AlwI, AlwNI, ApaI, ApaLI, ApeKI, ApoI, AscI, AseI, AsiSI, AvaI, AvaII, AvrII, BaeGI, BaeI, BamHI, BanI, BanII, BbsI, BbvCI, BbvI, BccI, BceAI, BcgI, BciVI, BclI, BfaI, BfuAI, BfuCI, BglI, BglII, BlpI, BmgBI, BmrI, BmtI, BpmI, Bpu10I, BpuEI, BsaAI, BsaBI, BsaHI, BsaI, BsaJI, BsaWI, BsaXI, BseRI, BseYI, BsgI, BsiEI, BsiHKAI, BsiWI, BslI, BsmAI, BsmBI, BsmFI, BsmI, BsoBI, Bsp1286I, BspCNI, BspDI, BspEI, BspHI, BspMI, BspQI, BsrBI, BsrDI, BsrFI, BsrGI, BsrI, BssHII, BssKI, BssSI, BstAPI, BstBI, BstEII, BstNI, BstUI, BstXI, BstYI, BstZ17I, Bsu36I, BtgI, BtgZI, BtsCI, BtsI, Cac8I, ClaI, CspCI, CviAII, CviKI-1, CviQI, DdeI, DpnI, DpnII, DraI, DraIII, DrdI, EaeI, EagI, EarI, EciI, Eco53kI, EcoNI, EcoO109I, EcoP15I, EcoRI, EcoRV, FatI, FauI, Fnu4HI, FokI, FseI, FspI, HaeII, HaeIII, HgaI, HhaI, HincII, HindIII, HinfI, HinP1I, HpaI, HpaII, HphI, Hpy166II, Hpy188I, Hpy188III, Hpy99I, HpyAV, HpyCH4III, HpyCH4IV, HpyCH4V, KasI, KpnI, MboI, MboII, MfeI, MluI, MlyI, MmeI, MnlI, MscI, MseI, MslI, MspA1I, MspI, MwoI, NaeI, NarI, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, NciI, NcoI, NdeI, NgoMIV, NheI, NlaIII, NlaIV, NmeAIII, NotI, NruI, NsiI, NspI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, Nt.CviPII, PacI, PaeR7I, PciI, PflFI, PflMI, PhoI, PleI, PmeI, PmlI, PpuMI, PshAI, PsiI, PspGI, PspOMI, PspXI, PstI, PvuI, PvuII, RsaI, RsrII, SacI, SacII, SalI, SapI, Sau3AI, Sau96I, SbfI, ScaI, ScrFI, SexAI, SfaNI, SfcI, SfiI, SfoI, SgrAI, SmaI, SmlI, SnaBI, SpeI, SphI, SspI, StuI, StyD4I, StyI, SwaI, T, TaqαI, TfiI, TliI, TseI, Tsp45I, Tsp509I, TspMI, TspRI, Tth111I, XbaI, XcmI, XhoI, XmaI, XmnI, and ZraI.

In some embodiments, the peptide may also be a homing endonuclease. Examples of homing endonucleases include I-CeuI, I-SceI, PI-PspI, and PI-SceI. The peptide may also be a nicking endonuclease. Examples of nicking endonucleases include Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII. In some embodiments, the peptide is high-fidelity. The peptide can be a high-fidelity variant of any peptide described herein.

In some embodiments, the peptide is a Cel I nuclease, a mung bean nuclease, a P1 nuclease, or an S1 nuclease. In some embodiments, the peptide is a single-strand specific (sss) nuclease. In some embodiments, the peptide is a nuclease useful for mutational analysis and/or single nucleotide polymorphism analysis. Cel I nuclease may be used for mutational analysis and single-nucleotide polymorphism analysis to cleave single base pair mismatches in heteroduplex DNA templates, as in the TILLING (Targeting Induced Local Lesions IN Genomes) mismatch cleavage method. In other embodiments, the peptide is capable of fast digestion.

In some embodiments, the peptide linked to a peptide tag described herein can be a deoxyribonuclease, ribonuclease, exonuclease, endonuclease, exodeoxyribonuclease, exoribonuclease, endodeoxyribonuclease, endoribonuclease, oligonuclease, RecBCD, deoxyribonuclease I, deoxyribonuclease II, deoxyribonuclease IV, UvrABC endonuclease, Aspergillus nuclease S1, or micrococcal nuclease.

Modified Polypeptides

In some embodiments, polypeptides described herein may be modified in any manner known in the art. For example, the polymerases described herein may be modified for use in Hot Start PCR methods. “Hot Start PCR” is a modified form of conventional polymerase chain reaction (PCR). Hot Start PCR typically involves the use of a polymerase that is inactivated at lower and ambient temperatures, and that is subsequently activated at higher temperatures, usually during the denaturation step of PCR (e.g., when the reaction reaches a temperature of 90 to 105° C., e.g., 95° C.). In some examples, the sample must be incubated for a certain period of time (e.g., more than 1, 5, 7, 10, 15, 20, or 30 minutes) at a specific temperature (e.g., about 85° C., 90° C., 95° C., 100° C., 105° C., or 110° C.). For example, the reaction may be incubated for 15 min at 95° C. in order to activate a Hot Start PCR polymerase. The use of such a polymerase prevents extension of non-specifically annealed primers and primer-dimers formed at low temperatures during PCR setup. A Hot Start PCR technique is especially useful for avoiding non-specific amplification of DNA, and increasing sensitivity and yield.

The inhibition of the polymerase used for Hot Start PCR is caused by an antibody, a peptide, or a chemical modification. The modification is usually made at active site side chains (e.g., ABgene Thermostart). One example of a chemically-modified polymerase useful for Hot Start PCR is a polymerase modified with an aldehyde modifying reagent, preferably formaldehyde (see, e.g., U.S. Pat. No. 6,183,998). Other examples include polymerases modified via other chemical reactions, such as by an anhydride reaction, and other modifications described in U.S. Pat. No. 5,773,258. Polymerases useful for Hot Start PCR may also be modified by linkage to a polymerase-specific antibody (see, e.g., U.S. Pat. No. 5,338,671). In some cases, polymerases are sequestered from other reagents in a reaction mix with physical barriers before the thermal cycling takes place. For example, in a wax-barrier method, a wax such as paraffin wax or a paraffin wax bead is used to sequester the polymerase from other reagents in the reaction mix.

In some embodiments, the polymerases disclosed herein may be chemically modified in order to facilitate Hot Start PCR methods, by any method known in the art (see, e.g., U.S. Pat. No. 5,773,258, U.S. Pat. No. 6,183,998). In some cases, the polymerases are modified with an antibody or peptide in order to facilitate Hot Start PCR methods. In still other cases, a polymerase described herein is sequestered in paraffin wax (e.g., a paraffin wax bead).

The reagents necessary for performing Hot Start PCR are packaged in kits that are commercially available. This activation of the polymerase at higher temperatures, such as high temperatures useful for the denaturation step of PCR

Polypeptide Variants

In some embodiments, polypeptides described in the present disclosure (e.g., polymerases) also include a vast number of sequence variations, mutants, and fragments thereof, that can be generated (e.g., in vitro) and screened for activity and stability. They also include any modified polypeptides that are commercially available (e.g., titanium polymerase (Invitrogen); Taq Gold (Applied Biosystems), etc.). Taq polymerases that are truncated often retain activity. Thus, the polypeptides described herein include N′- and C′-terminal truncations of Taq polymerases. Indeed, they include any Taq polymerase that retains activity.

In order to isolate sequence variants, random mutagenesis of the entire sequence or specific subsequences corresponding to particular domains may be performed. Alternatively, site directed mutagenesis can be performed reiteratively while avoiding mutations to residues critical for protease function. Mutation tolerance prediction programs can be used to greatly reduce the number of non-functional sequence variants that would be generated strictly by random mutagenesis. Various programs for predicting the effects of amino acid substitutions in a protein sequence on protein function (e.g., SIFT, PolyPhen, PANTHER PSEC, PMUT, and TopoSNP) are described in, e.g., Henikoff et al., (2006), Annu. Rev. Genomics Hum. Genet., 7:61-80.

In addition, the present disclosure provides different percentages of sequence identity for the polypeptides described. Percent sequence identity is determined by conventional methods. See, for example, Altschul et al., (1986), Bull. Math. Bio., 48:603, and Henikoff and Henikoff, (1992), Proc. Natl. Acad. Sci. USA, 89:10915. Briefly, two amino acid sequences are aligned to optimize the alignment scores using a gap opening penalty of 10, a gap extension penalty of 1, and the “BLOSUM62” scoring matrix of Henikoff and Henikoff (supra). The percent identity is then calculated as: ([Total number of identical matches]/[length of the longer sequence plus the number of gaps introduced into the longer sequence in order to align the two sequences])(100).
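
The percent identity formula above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example, with the gap penalties and BLOSUM62 matrix mentioned above) and are supplied as equal-length strings in which gaps are written as "-". The function name and the gap character are hypothetical choices made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Denominator: length of the longer (ungapped) sequence plus the number
    of gaps introduced into that longer sequence by the alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Identical matches: same residue in both rows, ignoring gap columns.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    len_a = len(aligned_a.replace(gap, ""))
    len_b = len(aligned_b.replace(gap, ""))
    if max(len_a, len_b) == 0:
        raise ValueError("sequences contain no residues")

    # Pick the longer sequence and count the gaps inserted into it.
    longer_row = aligned_a if len_a >= len_b else aligned_b
    denominator = max(len_a, len_b) + longer_row.count(gap)
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # Toy alignment (not a sequence from this disclosure).
    print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

In the toy alignment, five columns are identical and the longer sequence has seven residues and no gaps, so the value is 5/7 of 100.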

There are many established algorithms available to align two amino acid sequences. The “FASTA” similarity search algorithm of Pearson and Lipman is a suitable protein alignment method for examining the level of identity shared by an amino acid sequence disclosed herein and the amino acid sequence of another peptide. The FASTA algorithm is described by Pearson et al., (1988), Proc. Nat'l Acad. Sci. USA, 85:2444, and by Pearson (1990), Meth. Enzymol. 183:63. Briefly, FASTA first characterizes sequence similarity by identifying regions shared by the query sequence (e.g., SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:9) and a test sequence that have either the highest density of identities (if the ktup variable is 1) or pairs of identities (if ktup=2), without considering conservative amino acid substitutions, insertions, or deletions. The ten regions with the highest density of identities are then rescored by comparing the similarity of all paired amino acids using an amino acid substitution matrix, and the ends of the regions are “trimmed” to include only those residues that contribute to the highest score. If there are several regions with scores greater than the “cutoff” value (calculated by a predetermined formula based upon the length of the sequence and the ktup value), then the trimmed initial regions are examined to determine whether the regions can be joined to form an approximate alignment with gaps. Finally, the highest scoring regions of the two amino acid sequences are aligned using a modification of the Needleman-Wunsch-Sellers algorithm (Needleman et al., (1970), J. Mol. Biol. 48:444; Sellers (1974), SIAM J. Appl. Math., 26:787), which allows for amino acid insertions and deletions. Illustrative parameters for FASTA analysis are: ktup=1, gap opening penalty=10, gap extension penalty=1, and substitution matrix=BLOSUM62. These parameters can be introduced into a FASTA program by modifying the scoring matrix file (“SMATRIX”), as explained in Appendix 2 of Pearson, (1990), Meth. Enzymol., 183:63.
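
The first FASTA stage described above (finding the region with the highest density of identities for a given ktup) can be sketched as follows. This is a simplified illustration only, not the FASTA program itself: it counts shared words of length ktup on each alignment diagonal and reports the densest one, and it omits the rescoring, trimming, joining, and final gapped alignment stages. The function name and return format are assumptions made for this example.

```python
from collections import Counter
from typing import Optional, Tuple


def best_diagonal(query: str, test: str, ktup: int = 1) -> Optional[Tuple[int, int]]:
    """Return (diagonal, hits) for the diagonal sharing the most ktup words.

    The diagonal is (query position - test position); substitutions,
    insertions, and deletions are not considered at this stage.
    """
    if ktup < 1:
        raise ValueError("ktup must be at least 1")

    # Index every ktup-length word of the query by its start position.
    positions = {}
    for i in range(len(query) - ktup + 1):
        positions.setdefault(query[i:i + ktup], []).append(i)

    # Each word shared with the test sequence votes for one diagonal.
    hits = Counter()
    for j in range(len(test) - ktup + 1):
        for i in positions.get(test[j:j + ktup], []):
            hits[i - j] += 1

    if not hits:
        return None
    # Most hits wins; ties go to the diagonal closest to zero offset.
    return max(hits.items(), key=lambda kv: (kv[1], -abs(kv[0])))


if __name__ == "__main__":
    # Toy sequences (not sequences from this disclosure).
    print(best_diagonal("MKTAYIAK", "GGMKTAYIAK", ktup=2))
```

With ktup=2 a hit requires a pair of adjacent identities, which is more selective than ktup=1, where every single identical residue counts.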

Also provided herein are proteins having a conservative amino acid change, compared with an amino acid sequence disclosed herein. Among the common amino acids, for example, a “conservative amino acid substitution” is illustrated by a substitution among amino acids within each of the following groups: (1) glycine, alanine, valine, leucine, and isoleucine, (2) phenylalanine, tyrosine, and tryptophan, (3) serine and threonine, (4) aspartate and glutamate, (5) glutamine and asparagine, and (6) lysine, arginine and histidine. The BLOSUM62 table is an amino acid substitution matrix derived from about 2,000 local multiple alignments of protein sequence segments, representing highly conserved regions of more than 500 groups of related proteins. See Henikoff et al., (1992), Proc. Nat'l Acad. Sci., USA, 89:10915. Accordingly, the BLOSUM62 substitution frequencies can be used to define conservative amino acid substitutions that may be introduced into the amino acid sequences provided herein. Although it is possible to design amino acid substitutions based solely upon chemical properties (as discussed above), the language “conservative amino acid substitution” preferably refers to a substitution represented by a BLOSUM62 value of greater than −1. For example, an amino acid substitution is conservative if the substitution is characterized by a BLOSUM62 value of 0, 1, 2, or 3. According to this system, preferred conservative amino acid substitutions are characterized by a BLOSUM62 value of at least 1 (e.g., 1, 2 or 3), while more preferred conservative amino acid substitutions are characterized by a BLOSUM62 value of at least 2 (e.g., 2 or 3).
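
The BLOSUM62 thresholds above can be expressed as a short Python sketch. The tier labels and function names are hypothetical, and the lookup table is only a small excerpt of the BLOSUM62 matrix included so that the example runs on its own; a real analysis would load the complete matrix.

```python
# Small excerpt of BLOSUM62 off-diagonal scores, for illustration only.
BLOSUM62_EXCERPT = {
    frozenset("IV"): 3,
    frozenset("IL"): 2,
    frozenset("DE"): 2,
    frozenset("KR"): 2,
    frozenset("ST"): 1,
    frozenset("AG"): 0,
    frozenset("GW"): -2,
}


def substitution_tier(score: int) -> str:
    """Map a BLOSUM62 value to the tiers described in the text."""
    if score >= 2:
        return "more preferred conservative"
    if score >= 1:
        return "preferred conservative"
    if score > -1:
        return "conservative"
    return "not conservative"


def classify_substitution(original: str, replacement: str) -> str:
    """Classify a substitution using the excerpt table above."""
    score = BLOSUM62_EXCERPT[frozenset((original, replacement))]
    return substitution_tier(score)


if __name__ == "__main__":
    for pair in ("IV", "ST", "AG", "GW"):
        print(pair, classify_substitution(pair[0], pair[1]))
```

Because the tiers are nested, a substitution reported as "more preferred conservative" also satisfies the broader "preferred" and "conservative" definitions.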

It also will be understood that amino acid sequences may include additional residues, such as additional N- or C-terminal amino acids, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence retains sufficient biological protein activity to be functional in the compositions and methods provided herein.

In some cases, the composition comprises a polypeptide that is at least 10%, 20%, 50%, 70%, 75%, 80%, 85%, 90%, 95%, or 100% identical to a polypeptide encoded by SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:10, SEQ ID NO:12, fragments, mutants or variants thereof.

Pharmaceutical Compositions

Compositions provided herein may be administered as pharmaceutical formulations including those suitable for oral (including buccal and sub-lingual), rectal, intranasal, topical, transdermal, transdermal patch, pulmonary, vaginal, suppository, or parenteral (including intramuscular, intraarterial, intrathecal, intradermal, intraperitoneal, subcutaneous and intravenous) administration or in a form suitable for administration by aerosolization, inhalation or insufflation. General information on drug delivery systems can be found in Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems (Lippincott Williams & Wilkins, Baltimore Md. (1999)).

In various embodiments, the pharmaceutical composition includes carriers and excipients (including but not limited to buffers, carbohydrates, mannitol, proteins, polypeptides or amino acids such as glycine, antioxidants, bacteriostats, chelating agents, suspending agents, thickening agents and/or preservatives), water, oils including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like, saline solutions, aqueous dextrose and glycerol solutions, flavoring agents, coloring agents, detackifiers and other acceptable additives, adjuvants, or binders, other pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as pH buffering agents, tonicity adjusting agents, emulsifying agents, wetting agents and the like. Examples of excipients include starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol, water, ethanol and the like. In some embodiments, the pharmaceutical preparation is substantially free of preservatives. In other embodiments, the pharmaceutical preparation may contain at least one preservative. General methodology on pharmaceutical dosage forms is found in Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems (Lippincott Williams & Wilkins, Baltimore Md. (1999)). It will be recognized that, while any suitable carrier known to those of ordinary skill in the art may be employed to administer the compositions provided herein, the type of carrier will vary depending on the mode of administration. A thorough discussion of pharmaceutically acceptable carriers/excipients can be found in Remington's Pharmaceutical Sciences, Gennaro, A. R., ed., 20th edition, 2000: Williams and Wilkins, PA, USA.

Compounds may also be encapsulated within liposomes using well-known technology. Biodegradable microspheres may also be employed as carriers for the pharmaceutical compositions provided herein. Suitable biodegradable microspheres are disclosed, for example, in U.S. Pat. Nos. 4,897,268; 5,075,109; 5,928,647; 5,811,128; 5,820,883; 5,853,763; 5,814,344 and 5,942,252.

The compound may be administered in liposomes or microspheres (or microparticles). Methods for preparing liposomes and microspheres for administration to a patient are well known to those of skill in the art. U.S. Pat. No. 4,789,734, the contents of which are hereby incorporated by reference, describes methods for encapsulating biological materials in liposomes. Essentially, the material is dissolved in an aqueous solution, the appropriate phospholipids and lipids added, along with surfactants if required, and the material dialyzed or sonicated, as necessary. A review of known methods is provided by G. Gregoriadis, Chapter 14, “Liposomes,” Drug Carriers in Biology and Medicine, pp. 287-341 (Academic Press, 1979).

Microspheres formed of polymers or proteins are well known to those skilled in the art, and can be tailored for passage through the gastrointestinal tract directly into the blood stream. Alternatively, the compound can be incorporated and the microspheres, or composite of microspheres, implanted for slow release over a period of time ranging from days to months. See, for example, U.S. Pat. Nos. 4,906,474, 4,925,673 and 3,625,214, and Jein, TIPS 19:155-157 (1998), the contents of which are hereby incorporated by reference.

The concentration of drug may be adjusted, the pH of the solution buffered and the isotonicity adjusted to be compatible with intravenous injection, as is well known in the art.

The compounds provided herein may be formulated as a sterile solution or suspension, in suitable vehicles, well known in the art. The pharmaceutical compositions may be sterilized by conventional, well-known sterilization techniques, or may be sterile filtered. The resulting aqueous solutions may be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile solution prior to administration. Suitable formulations and additional carriers are described in Remington “The Science and Practice of Pharmacy” (20th Ed., Lippincott Williams & Wilkins, Baltimore Md.), the teachings of which are incorporated by reference in their entirety herein.

The agents or their pharmaceutically acceptable salts may be provided alone or in combination with one or more other agents or with one or more other forms. For example, a formulation may comprise one or more agents in particular proportions, depending on the relative potencies of each agent and the intended indication. For example, in compositions for targeting two different host targets and where potencies are similar, about a 1:1 ratio of agents may be used. The two forms may be formulated together, in the same dosage unit, e.g., in one cream, suppository, tablet, capsule, aerosol spray, or packet of powder to be dissolved in a beverage; or each form may be formulated in a separate unit, e.g., two creams, two suppositories, two tablets, two capsules, a tablet and a liquid for dissolving the tablet, two aerosol sprays, or a packet of powder and a liquid for dissolving the powder, etc.

The term “pharmaceutically acceptable salt” means those salts which retain the biological effectiveness and properties of the agents provided herein, and which are not biologically or otherwise undesirable. For example, a pharmaceutically acceptable salt does not interfere with the effect of an agent provided herein in preventing, reducing, or destabilizing the formation of a multi-subunit complex, or promoting the disruption of a multi-subunit complex.

Typical salts are those of the inorganic ions, such as, for example, sodium, potassium, calcium, magnesium ions, and the like. Such salts include salts with inorganic or organic acids, such as hydrochloric acid, hydrobromic acid, phosphoric acid, nitric acid, sulfuric acid, methanesulfonic acid, p-toluenesulfonic acid, acetic acid, fumaric acid, succinic acid, lactic acid, mandelic acid, malic acid, citric acid, tartaric acid or maleic acid. In addition, if the agent(s) contain a carboxy group or other acidic group, it may be converted into a pharmaceutically acceptable addition salt with inorganic or organic bases. Examples of suitable bases include sodium hydroxide, potassium hydroxide, ammonia, cyclohexylamine, dicyclohexylamine, ethanolamine, diethanolamine, triethanolamine, and the like.

A pharmaceutically acceptable ester or amide refers to those which retain biological effectiveness and properties of the agents provided herein, and which are not biologically or otherwise undesirable. For example, the ester or amide does not interfere with the beneficial effect of an agent provided herein in preventing, reducing or destabilizing assembly of the multi-subunit complex, or promoting disruption or elimination of the multi-subunit complex in the cells, or preventing or alleviating one or more signs or pathological symptoms associated with exposure to one or more multi-subunit complexes or insoluble components in a subject. Typical esters include ethyl, methyl, isobutyl, ethylene glycol, and the like. Typical amides include unsubstituted amides, alkyl amides, dialkyl amides, and the like.

Aqueous compositions provided herein comprise an effective amount of a composition of the present invention, which may be dissolved or dispersed in a pharmaceutically acceptable carrier or aqueous medium. A pharmaceutically acceptable carrier used herein may include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.

Exemplary pharmaceutically acceptable carriers for injectable compositions can include calcium salts, for example, calcium chlorides, calcium bromides, calcium sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. For example, compositions of the invention may be provided in liquid form, and formulated in saline based aqueous solution of varying pH (5-8), with or without detergents such as polysorbate-80 at 0.01-1%, or carbohydrate additives, such as mannitol, sorbitol, or trehalose. Commonly used buffers include histidine, acetate, phosphate, or citrate. Under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.

For human administration, preparations meet sterility, pyrogenicity, general safety, and purity standards as required by FDA and other regulatory agency standards. The active compounds will generally be formulated for parenteral administration, e.g., formulated for injection via the intravenous, intramuscular, subcutaneous, intralesional, or intraperitoneal routes. The preparation of an aqueous composition that contains an active component or ingredient will be known to those of skill in the art in light of the present disclosure. Typically, such compositions can be prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for use in preparing solutions or suspensions upon the addition of a liquid prior to injection can also be prepared; and the preparations can also be emulsified.

Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various other ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, methods of preparation include vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Upon formulation, solutions will be systemically administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective based on the criteria described herein. The formulations are easily administered in a variety of dosage forms, such as the type of injectable solutions described above, but drug release capsules and the like can also be employed.

The appropriate quantity of a pharmaceutical composition to be administered, the number of treatments, and unit dose will vary according to the subject to be treated, and the disease state of the subject. The person responsible for administration will, in any event, determine the appropriate dose for the individual subject.

In addition to the compounds formulated for parenteral administration, such as intravenous or intramuscular injection, other alternative methods of administration of the present invention may also be used, including but not limited to intradermal administration (See U.S. Pat. Nos. 5,997,501; 5,848,991; and 5,527,288), pulmonary administration (See U.S. Pat. Nos. 6,361,760; 6,060,069; and 6,041,775), buccal administration (See U.S. Pat. Nos. 6,375,975; and 6,284,262), transdermal administration (See U.S. Pat. Nos. 6,348,210; and 6,322,808) and transmucosal administration (See U.S. Pat. No. 5,656,284). Such methods of administration are well known in the art. One may also use intranasal administration of the present invention, such as with nasal solutions or sprays, aerosols or inhalants. Nasal solutions are usually aqueous solutions designed to be administered to the nasal passages in drops or sprays. Nasal solutions are prepared so that they are similar in many respects to nasal secretions. Thus, the aqueous nasal solutions usually are isotonic and slightly buffered to maintain a pH of 5.5 to 6.5. In addition, antimicrobial preservatives, similar to those used in ophthalmic preparations and appropriate drug stabilizers, if required, may be included in the formulation. Various commercial nasal preparations are known and include, for example, antibiotics and antihistamines and are used for asthma prophylaxis.

Additional formulations, which are suitable for other modes of administration, include suppositories and pessaries. A rectal pessary or suppository may also be used. Suppositories are solid dosage forms of various weights and shapes, usually medicated, for insertion into the rectum or the urethra. After insertion, suppositories soften, melt or dissolve in the cavity fluids. For suppositories, traditional binders and carriers generally include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in any suitable range, e.g., in the range of 0.5% to 10%, preferably 1%-2%.

Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations, or powders. In certain defined embodiments, oral pharmaceutical compositions will comprise an inert diluent or assimilable edible carrier, or they may be enclosed in a hard or soft shell gelatin capsule, or they may be compressed into tablets, or they may be incorporated directly with the food of the diet. For oral therapeutic administration, the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. Such compositions and preparations can contain at least 0.1% of active compound. The percentage of the compositions and preparations may, of course, be varied, and may conveniently be between about 2% and about 75% of the weight of the unit, or between about 25% and about 60%. The amount of active compounds in such therapeutically useful compositions is such that a suitable dosage will be obtained.

The tablets, troches, pills, capsules and the like may also contain the following: a binder, such as gum tragacanth, acacia, cornstarch, or gelatin; excipients, such as dicalcium phosphate; a disintegrating agent, such as corn starch, potato starch, alginic acid and the like; a lubricant, such as magnesium stearate; and a sweetening agent, such as sucrose, lactose or saccharin, or a flavoring agent, such as peppermint, oil of wintergreen, or cherry flavoring. When the dosage unit form is a capsule, it may contain, in addition to materials of the above type, a liquid carrier. Various other materials may be present as coatings or to otherwise modify the physical form of the dosage unit. For instance, tablets, pills, or capsules may be coated with shellac, sugar or both. A syrup or elixir may contain the active compounds, sucrose as a sweetening agent, methyl and propyl parabens as preservatives, and a dye and flavoring, such as cherry or orange flavor. In some embodiments, an oral pharmaceutical composition may be enterically coated to protect the active ingredients from the environment of the stomach; enteric coating methods and formulations are well-known in the art.

Kits/Mixtures/Further Compositions

In some embodiments, the compositions disclosed herein may be directly formulated into compositions (e.g., 5× solution concentration) to be used in techniques requiring the use of a thermostable enzyme, such as compositions for quantitative polymerase chain reactions (qPCR) (e.g., real-time PCR, RT PCR, RT qPCR, probe qPCR, EvaGreen qPCR, HRM).

The kits disclosed herein may comprise a DNA-binding dye, particularly dyes that bind double-stranded DNA and emit a signal such as a fluorescent signal. Nonlimiting examples of DNA binding dyes include EvaGreen™, described in U.S. Pat. No. 7,601,498; LC Green; SYTO9; Chromofy; BEBO; and SYBR Green. Such dyes are particularly useful for quantitative PCR (qPCR) applications.

The kits may comprise a reference dye (e.g., ROX dye), or a quencher dye (e.g., TAMRA). In other cases, FRET may be used. FRET may also be used for the reference dye. In some cases, the FRET dye used for the reference dyes is composed of a fluorophore dye (e.g., FAM) followed by a nucleic acid sequence followed by a dye (e.g., ROX dye). Examples of nucleic acid sequences include deoxynucleotides, such as repetitive dT elements. Examples include 6, 7, 8, 9, 10, 11, 12, or more dT nucleotides. Other nucleotides (either repeated or mixes of different nucleotides) can be used. For example, repeated dA, dC, dG, or dU nucleotides can be used. Non-limiting examples of FRET dyes include the following: 5′FAM-TTTTTTTT-3′ROX (8dT); 5′FAM-TTTTTTTTT-3′ROX (9dT); or 5′FAM-TTTTTTTTTT-3′ROX (10dT). The FRET phenomenon can work at distances from 1-5 nm, up to 10 nm. The T-T distance may be approximately 0.24 to 0.36 nm. The distance between the reference and quencher dye can be between about 0.24 to 0.36 nm. The reference dye can have a nucleotide sequence that positions the FRET pairs at an appropriate distance from each other that allows for FRET to occur. In some embodiments, the expected distance between FRET pairs can be less than about or about 2 to 6 nanometers. In some embodiments, the distance between the FRET pairs can be about or up to about 2.72, 3.06, or 3.4 nm. The distance between the reference and quencher dye can be adjusted by selecting different numbers of repeated nucleotides, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, or 22 repeated A, C, T, or G nucleotides. Other reference and quencher dyes may be used, e.g., those described herein.
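
The relationship between the number of repeated nucleotides and the dye-to-dye distance can be sketched numerically. The example below assumes a spacing of 0.34 nm per nucleotide; this value is not stated in the text, but it lies within the 0.24 to 0.36 nm range given above and reproduces the quoted distances of 2.72, 3.06, and 3.4 nm for the 8dT, 9dT, and 10dT linkers. The function names are hypothetical, and the model treats the linker as a straight rod, which a flexible single-stranded oligonucleotide only approximates.

```python
# Assumed spacing per nucleotide in nm (illustrative; see lead-in).
RISE_PER_NT_NM = 0.34


def linker_length_nm(n_nucleotides: int, rise_nm: float = RISE_PER_NT_NM) -> float:
    """Approximate end-to-end linker length for a fully extended oligo."""
    if n_nucleotides < 0:
        raise ValueError("nucleotide count cannot be negative")
    return round(n_nucleotides * rise_nm, 2)


def within_fret_window(distance_nm: float, low_nm: float = 2.0, high_nm: float = 6.0) -> bool:
    """Check a distance against the roughly 2 to 6 nm window given in the text."""
    return low_nm <= distance_nm <= high_nm


if __name__ == "__main__":
    for n in (8, 9, 10):
        d = linker_length_nm(n)
        print(f"{n}dT: {d} nm, within 2-6 nm window: {within_fret_window(d)}")
```

Under the same assumption, linkers of roughly 6 to 17 nucleotides would fall inside the 2 to 6 nm window.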

In some embodiments, the above mentioned reference dyes can be used in a variety of reactions, including reactions for real-time quantitative PCR. They can allow for real-time calculation of curves that account for a passive reference dye. The reference dye can be used in a variety of mixes, including probe mixes, EvaGreen mixes, and HRM and EvaGreen mixes. The reference dye can be used at a single concentration across different qPCR machines, e.g., an ABI 7900HT, an ABI 7500, and a OneStepPlus. The concentration of reference dye may not need to be adjusted based on the machine in use. Without being limited to theory, the reference dye can be used at a single concentration across multiple machines because the reference dye utilizes the FRET phenomenon.

In some embodiments, the reference dye can have any sequence with a melting temperature of about or up to about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23° C. The reference dye with repeated nucleotides, e.g., repeated thymine nucleotides, can be useful as a reference dye because the nucleotide sequence has a low melting temperature. Low melting temperature may reduce the probability of the reference dye annealing to an undesired moiety. This can reduce the interaction between double labeled oligonucleotides and a DNA template. This can also reduce the formation of dimers.
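
As a rough illustration of why short repeated-thymine linkers have low melting temperatures, the sketch below applies the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in ° C. This rule is not part of the present disclosure; it is a coarse approximation for short oligonucleotides hybridized to a perfect complement, and actual values depend on salt and oligonucleotide concentration. The function name is hypothetical.

```python
def wallace_tm_celsius(sequence: str) -> int:
    """Estimate Tm with the Wallace rule: 2 per A/T plus 4 per G/C.

    A coarse approximation for short oligonucleotides only.
    """
    seq = sequence.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for n in (8, 9, 10):
        print(f"{n}dT: about {wallace_tm_celsius('T' * n)} C")
```

Under this approximation the 8dT, 9dT, and 10dT linkers fall at about 16, 18, and 20° C, inside the range recited above, whereas a G/C-rich sequence of the same length would be estimated about twice as high.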

The reference dyes can exhibit temperature and pH stability. The reference dyes can retain about or greater than about 60, 70, 80, 90, or 100% effectiveness after incubation at about, up to about, or greater than about −80, −60, −40, −30, −20, −10, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100° C. for 1, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100 hours. The reference dyes can retain 60, 70, 80, 90, or 100% effectiveness after incubation at about, up to about, or greater than about pH 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 for 1, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100 hours.

In some embodiments, the reference dyes can be resistant to bleaching. The reference dyes can retain about or greater than about 50, 60, 70, 80, 90, or 100% of their fluorescence emission capability after exposure to ambient light for about or greater than about 1, 2, 5, 10, 30, 60, 90, 120, 180, 240, or 360 minutes.

The kits may also comprise reaction media or buffers. Appropriate reaction media or buffers for kits comprising polymerases permit nucleic acid amplification according to the methods of the invention. Such media and conditions are known to persons of skill in the art, and are described in various publications, such as U.S. Pat. Nos. 5,554,516; 5,716,785; 5,130,238; 5,194,370; 6,090,591; 5,409,818; 5,554,517; 5,169,766; 5,480,784; 5,399,491; 5,679,512; and PCT Pub. No. WO 99/42618. For example, a buffer may be Tris buffer, although other buffers can also be used as long as the buffer components are non-inhibitory to enzyme components of the methods of the invention. The pH is from about 5 to about 11, but may also be from about 6 to about 10, from about 7 to about 9, or from about 7.5 to about 8.5. More acidic and alkaline buffers may also be used.

In some embodiments, the reaction medium can also include bivalent metal ions such as Mg²⁺ or Mn²⁺, at a final concentration of free ions that is within the range of from about 0.01 to about 15 mM, or from about 1 to 10 mM. In some embodiments, the reaction medium comprises MgCl₂ (e.g., greater than 1, 1.5, 2, 5, 7.5, 10, 15, 20, 25, 30, 40, or 50 mM of MgCl₂).

In some embodiments, the reaction medium can also include other salts, such as KCl or NaCl, that contribute to the total ionic strength of the medium. For example, the range of a salt such as KCl is preferably from about 0 to about 125 mM, more preferably from about 0 to about 100 mM, and most preferably from about 0 to about 75 mM. The reaction medium can further include additives that could affect performance of the amplification reactions, but that are not integral to the activity of the enzyme components of the methods. Such additives include proteins such as BSA, single strand binding proteins (e.g., T4 gene 32 protein), and non-ionic detergents such as NP40 or Triton. Reagents, such as DTT, that are capable of maintaining enzyme activities can also be included. Such reagents are known in the art.

In some embodiments, a buffer of the invention can include 50-80 mM TRIS, pH 8.3-9.0 and 10-20 mM (NH₄)₂SO₄ or 30-50 mM KCl. The pH of the buffer can be adjusted depending on the polymerase. In some embodiments, a higher pH, e.g., a pH about, less than about, or greater than about 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, or 8.9, can be used for a standard polymerase and a lower pH, e.g., a pH about, less than about, or greater than about 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, or 8.9, can be used for a hot start polymerase.

In some embodiments, an Evagreen based real time PCR mix can contain 3 polymerases. The mix can contain a main polymerase. The main polymerase can have a concentration of about, greater than about, or less than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. The mix can contain a polymerase with a double strand binding domain integrated between a peptide tag of the invention and a main Taq amino acid sequence, e.g., a main polymerase with a double strand binding domain. Such a polymerase can have a concentration of about, greater than about, or less than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. The mix can also contain a proof-reading polymerase (which may be Tgo based), where the peptide tag is at the N-terminus and the double strand binding domain is at the C-terminus. Such a polymerase can have a concentration of about, greater than about, or less than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. Such polymerases can have a peptide tag without influence on the 3′ to 5′ exonuclease or proofreading activity of the polymerase.

In some embodiments, a probe mix of the invention can have two polymerases. The main polymerase can be a polymerase with 5′ to 3′ exonuclease activity. The polymerase can have a peptide tag that does not influence the 5′ to 3′ exonuclease activity. The tag can be located at the N-terminus. The probe mix can also include a polymerase with a double strand binding domain. The double strand binding domain can decrease the 5′ to 3′ exonuclease activity. The reduction in activity can be about, less than about, or greater than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100%.

In some embodiments, a typical master mix for end point PCR can contain 3 polymerases, an additive (for example, bovine serum albumin (BSA)), a DNA tracking dye (for example, Bromophenol blue), and a DNA sample loading component (for example, glycerol). The 3 polymerases may have the amino acid sequences set out in SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:11, respectively. The mix can contain a main polymerase. The main polymerase can have a concentration of about, greater than about, or less than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. The mix can contain a polymerase with a double strand binding domain integrated between a peptide tag of the invention and a main Taq amino acid sequence, e.g., a main polymerase with a double strand binding domain. Such a polymerase can have a concentration of about, greater than about, or less than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. The mix can also contain a proof-reading polymerase (which may be Tgo based), where the peptide tag is at the N-terminus and the double strand binding domain is at the C-terminus. Such a polymerase can have a concentration of about, greater than about, or less than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100%, which may be wt %, vol %, or mol %, or a percentage of the total polymerase in the mix on a molar, mass, or volume basis. Such polymerases can have a peptide tag without influence on the 3′ to 5′ exonuclease or proofreading activity of the polymerase.

In some embodiments, the buffers described herein can include a linear polyacrylamide (LPA). The LPA may increase specificity and sensitivity of an enzyme. The LPA can be added to real-time mixes, including real-time mixes containing either EvaGreen or any probe, e.g., any probe described herein.

In some embodiments, a buffer can have a MgCl₂ concentration of 12.5 mM in a storage buffer and the reaction concentration can be 2.5 mM MgCl₂. In other embodiments, the reaction concentration of MgCl₂ can be between 1.5 and 2.5 mM. The concentration of a DNA template can be 1 to 50 ng/microliter.
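By way of illustration only, the relationship between the 12.5 mM storage concentration and the 2.5 mM reaction concentration can be sketched with the standard dilution relation C1·V1 = C2·V2. The function name and the 25 microliter reaction volume below are assumptions chosen for the example and are not taken from this disclosure.

```python
def stock_volume_ul(stock_mm: float, final_mm: float, reaction_ul: float) -> float:
    """Volume of stock (in microliters) needed so that the reaction reaches
    final_mm, using C1 * V1 = C2 * V2."""
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    if not 0 <= final_mm <= stock_mm:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return final_mm * reaction_ul / stock_mm


# Assumed 25 microliter reaction; 12.5 mM storage buffer diluted to 2.5 mM MgCl2.
REACTION_UL = 25.0
high = stock_volume_ul(12.5, 2.5, REACTION_UL)   # upper end of the stated range
low = stock_volume_ul(12.5, 1.5, REACTION_UL)    # lower end of the stated range
print(high, low)  # 5.0 3.0
```

Under these assumptions the storage buffer behaves as a 5× concentrate when the reaction is run at 2.5 mM MgCl₂.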

In a reaction using an EvaGreen dye, the final reaction concentration of a primer (forward or reverse) can be between 80 and 250 nM. In a reaction not using an EvaGreen dye and including a probe, the final concentration of a primer can be 200-400 nM and the final concentration of the probe can be 100 to 250 nM.

In a reaction using a proofreading enzyme, the MgCl₂ concentration can be 1.5 mM, the concentration of a primer (forward or reverse) can be 100 to 300 nM, and the concentration of a template DNA can be 5-50 ng/microliter.

Where appropriate, an RNase inhibitor (such as RNasin) that does not inhibit the activity of the RNase employed in the method can also be included. Any aspect of the methods of the invention can occur at the same or varying temperatures. Preferably, the amplification reactions (particularly, primer extension other than the first and second strand cDNA synthesis steps, and strand displacement) are performed isothermally, which avoids the cumbersome thermocycling process. The amplification reaction is carried out at a temperature that permits hybridization of the primers to the template polynucleotide and primer extension products, and that does not substantially inhibit the activity of the enzymes employed. The temperature can be in the range of about 25° C. to about 85° C., about 30° C. to about 80° C., or about 37° C. to about 75° C.

The oligonucleotide components of the amplification reactions provided herein are generally in excess of the number of target nucleic acid sequences to be amplified. They can be provided at about or at least about any of the following: 10, 10², 10⁴, 10⁶, 10⁸, 10¹⁰, 10¹² times the amount of target nucleic acid.

In one embodiment, the foregoing components are added simultaneously at the initiation of the amplification process. In another embodiment, components are added in any order prior to or after appropriate timepoints during the amplification process, as required and/or permitted by the amplification reaction. Such timepoints can be readily identified by a person of skill in the art. The enzymes used for nucleic acid amplification according to the methods of the invention can be added to the reaction mixture either prior to the target nucleic acid denaturation step, following the denaturation step, or following hybridization of the primer to the target RNA or DNA, as determined by their thermal stability and/or other considerations known to the person of skill in the art.

The amplification process can be stopped at various timepoints, and resumed at a later time. Said timepoints can be readily identified by a person of skill in the art.

In some embodiments, the compositions may also comprise dNTPs (e.g., greater than 1, 1.5, 2, 5, 7.5, 10, 15, 20, 25, 30, 40, or 50 mM dNTPs). The dNTPs may be ultrapure dNTPs. The dNTPs may comprise dATP, dGTP, dCTP, dTTP, dUTP, or any combination thereof. In some cases, the composition comprises a dye (e.g., blue or yellow dye). In some cases, a buffer comprises a detergent described herein. In some cases, the buffer does not comprise a detergent. In some cases, the buffer has a high pH. In other embodiments, the buffer has a low or neutral pH. In some cases, the buffer contains (NH₄)₂SO₄.

In some embodiments, the compositions provided herein may be provided in a solution. The solutions may be formulated at different concentrations. For example, the solution may be 1×, 2×, 3×, 4×, 5×, 10×, or greater than 15× concentration. Further descriptions of formulations of the compositions are provided herein.

Some kits comprise two polymerases. For example, a kit may comprise a polymerase with 5′ to 3′ exonuclease activity (e.g., SEQ ID NO: 2) at a concentration of about 10%, 20%, 50%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.5%, 99.7%, 99.8%, 99.9%, or 100% of the total polymerase concentration. Such a kit may comprise a second polymerase (e.g., a polymerase tagged with the peptide of SEQ ID NO: 10) at a concentration of 0.1%, 0.2%, 0.3%, 0.5%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 50%, 75%, 80%, 85%, 90%, 95%, or 100% of the total polymerase concentration. In a preferred embodiment, a polymerase with 5′ to 3′ exonuclease activity (e.g., SEQ ID NO: 2) (or any polypeptide comprising the peptide tag of SEQ ID NO: 1) is present at a concentration of greater than 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.5%, 99.7%, 99.8%, or 99.9% of the total concentration of polymerase, while the second polymerase is present at a concentration of less than 0.1%, 0.2%, 0.3%, 0.5%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, or 15%. In another preferred embodiment, a fusion polypeptide comprising the peptide tag of SEQ ID NO: 1 is present at a concentration of greater than 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.5%, 99.7%, 99.8%, or 99.9% of the total concentration of polymerase, while the second polypeptide is a polypeptide comprising either the DSP peptide (SEQ ID NO: 10) or two peptide tags (e.g., SEQ ID NO: 8 or SEQ ID NO: 11 (showing two tags in one polymerase)) and is present at a concentration of less than 0.1%, 0.2%, 0.3%, 0.5%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, or 15%. In still other cases, a wild-type Taq (or any polymerase) is present at 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.5%, 99.7%, 99.8%, or 99.9% of the total concentration of polymerase, while the second polypeptide is a polypeptide comprising either the DSP peptide (SEQ ID NO: 10) or two peptide tags (e.g., SEQ ID NO: 8) or is a polymerase with two peptide tags separated by the polymerase (e.g., SEQ ID NO: 11) and is present at a concentration of less than 0.1%, 0.2%, 0.3%, 0.5%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, or 15%.
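As a hedged illustration of how the percentages above relate to the amounts actually combined, the sketch below converts the quantity of each polymerase in a mix into its share of the total polymerase. The component names, the example quantities, and the choice of the 85%/15% thresholds as the check are assumptions made for the example; the quantities may be expressed on a molar, mass, or volume basis as long as one basis is used throughout.

```python
def polymerase_shares(amounts: dict[str, float]) -> dict[str, float]:
    """Return each polymerase's percentage of the total polymerase in the mix."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total polymerase amount must be positive")
    return {name: 100.0 * amount / total for name, amount in amounts.items()}


def main_dominates(shares: dict[str, float], main: str,
                   main_min: float = 85.0, other_max: float = 15.0) -> bool:
    """Check that the main polymerase exceeds main_min percent and that every
    other polymerase stays below other_max percent."""
    others_ok = all(v < other_max for k, v in shares.items() if k != main)
    return shares[main] > main_min and others_ok


# Hypothetical two-polymerase mix (arbitrary units on a single basis).
mix = {"main_polymerase": 99.0, "second_polymerase": 1.0}
shares = polymerase_shares(mix)
print(shares)
print(main_dominates(shares, "main_polymerase"))  # True
```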

In certain embodiments, some kits may comprise three or more polymerases. For example, a kit may comprise a polymerase (e.g., SEQ ID NO: 2 or any polypeptide comprising the peptide tag of SEQ ID NO: 1) at a concentration of about 10%, 20%, 50%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.5%, 99.7%, 99.8%, 99.9%, or 100% of the total polymerase concentration. In addition, such a kit may comprise a second polymerase such as a polymerase comprising the peptide tag of SEQ ID NO: 8, 10 or 13. Furthermore, such a kit may comprise a third polypeptide such as a proof-reading polymerase (e.g., Tgo polymerase). Such a proof-reading polymerase may comprise the peptide tag of SEQ ID NO: 1, 8, 10, or 13. For example, such a proof-reading polymerase may be SEQ ID NO: 11 (FIG. 12). The second and third polypeptides may each be present at a concentration of less than 0.1%, 0.2%, 0.3%, 0.5%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, or 15%.

In some embodiments, the compositions may also be used in amplifications involving the use of thermostable DNA polymerases such as Taq or Tgo DNA polymerases, or mutants, derivatives or fragments thereof. For example, in some cases, a polypeptide encoded by SEQ ID NO: 5 or SEQ ID NO: 4 or SEQ ID NO: 12 is used in combination with a polymerase (e.g., Taq polymerase) or proof-reading polymerase (e.g., Tgo DNA polymerase) in an amplification. In some cases, the quantity of a polypeptide encoded by SEQ ID NO: 5, or variants, fragments, or mutants thereof, or SEQ ID NO: 4 (or SEQ ID NO: 12), or variants, fragments or mutants thereof, is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the quantity of a polymerase (e.g., Taq or Tgo DNA polymerase) used in the amplification. Therefore, in some cases, a composition disclosed herein is a mixture of a polypeptide encoded by SEQ ID NO: 5, or variants, fragments, or mutants thereof, or a polypeptide encoded by SEQ ID NO: 4, or variants, fragments, or mutants thereof, and a polymerase (e.g., Taq or Tgo DNA polymerase), wherein the quantity of a polypeptide encoded by SEQ ID NO: 5, or variants, mutants or fragments thereof, or SEQ ID NO: 4, or variants, mutants or fragments thereof, is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the quantity of the polymerase (e.g., Taq or Tgo DNA polymerase). In certain preferred examples, the composition is a mixture of a polypeptide encoded by SEQ ID NO: 4, or variants, mutants or fragments thereof, and a polypeptide encoded by SEQ ID NO: 5, or variants, mutants, or fragments thereof, wherein the quantity of polypeptide encoded by SEQ ID NO: 5, or variants thereof, in the mixture is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the quantity of polypeptide encoded by SEQ ID NO: 4, or variants thereof.

In some cases, the concentration of a polypeptide encoded by SEQ ID NO: 5, or variants thereof, or SEQ ID NO: 4, or variants thereof, is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the concentration of a polymerase (e.g., Taq or Tgo DNA polymerase) used in the amplification. Therefore, in some cases, a composition disclosed herein comprises a mixture of a polypeptide encoded by SEQ ID NO: 5, or variants thereof, or a polypeptide encoded by SEQ ID NO: 4, or variants thereof, and a polymerase (e.g., Taq or Tgo DNA polymerase), wherein the concentration of a polypeptide encoded by SEQ ID NO: 5, or variants thereof, or SEQ ID NO: 4, or variants thereof, is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the concentration of the polymerase (e.g., Taq or Tgo DNA polymerase). All of the embodiments disclosed herein may also comprise polypeptides comprising a polypeptide encoded by SEQ ID NO: 7, or mutants, variants or fragments thereof, such as the DSP fragment depicted in the figures, and/or a polypeptide or polypeptides comprising a polypeptide encoded by SEQ ID NO: 3, or fragments, mutants, or variants thereof.

In some cases, the composition is a mixture of a polypeptide encoded by SEQ ID NO: 5, or fragments, mutants, or variants thereof, and a polypeptide encoded by SEQ ID NO: 4, or variants, mutants or fragments thereof. In some cases, such a mixture further includes a proofreading enzyme (e.g., Tgo DNA polymerase). Therefore, in some cases the composition comprises a polymerase with 5′→3′ exonuclease activity as well as an enzyme with 3′→5′ proofreading activity.

In some cases, the quantity of a polypeptide encoded by SEQ ID NO: 5, or mutants, fragments, or variants thereof (or of a polypeptide comprising a polypeptide encoded by SEQ ID NO: 7, or fragment thereof such as DSP), in such a composition is at least 1-, 2-, 3-, 4-, 5-, 10-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-, 100-, 250-, 500-, 1000-, 5000-, 10,000-, 15,000-, 20,000-, 50,000-, 100,000-, 500,000-, 10⁶-, 10⁷-, 10⁸-, or 10⁹-fold less than the quantity of a polypeptide encoded by SEQ ID NO: 4, or variants thereof, and/or of the proofreading enzyme.

In some cases, a polypeptide is translated as both a short form and a long form. In some cases, a eukaryotic translation initiation factor is used to facilitate translation of a polypeptide. In some cases, the translation occurs in a bacterium.

A kit may comprise one or more compositions described herein as well as instructions for the use of said compositions. The instructions may include directions for formulating the reaction sample (including the relevant concentrations of polymerase, template, primers (reverse and forward), dNTPs, BSA, and H₂O). The instructions may also include recommendations for running the PCR cycle, particularly the denaturation, annealing, and elongation phases. Such instructions may include the temperature conditions and amount of time, or number of cycles, for each step. For example, a recommendation for qPCR may be to have an initial denaturation step at 95° C. for 15 min (for example, to activate a hot-start enzyme), followed by 40 cycles of the following steps: denaturation at 95° C. for 15 sec; annealing at 60° C.-65° C. for 20 sec; and elongation at 72° C. for 20 sec.
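The exemplary qPCR recommendation above can be written down as a small data structure, which also makes it easy to estimate the total programmed hold time. This is a minimal sketch: the `Step` class and the function name are hypothetical, 60° C. is chosen arbitrarily from the stated 60° C.-65° C. annealing range, and instrument ramp time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


# Initial denaturation / hot-start activation: 95 C for 15 min.
INITIAL = Step("initial denaturation", 95.0, 15 * 60)

# One amplification cycle; annealing at 60 C is an assumed value within 60-65 C.
CYCLE = [
    Step("denaturation", 95.0, 15),
    Step("annealing", 60.0, 20),
    Step("elongation", 72.0, 20),
]
N_CYCLES = 40


def total_hold_seconds(initial: Step, cycle: list[Step], n_cycles: int) -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total = total_hold_seconds(INITIAL, CYCLE, N_CYCLES)
print(f"{total} s = {total // 60} min {total % 60} s")  # 3100 s = 51 min 40 s
```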

Nucleic Acid Vectors/Cells

The compositions disclosed herein also include nucleic acids and vectors encoding any of the polypeptides described herein. Non-limiting examples of such constructs include constructs comprising the nucleic acid sequence of SEQ ID NO: 3, 4, 5, 7, 9, and/or 12. Still other examples include constructs comprising nucleic acids encoding polypeptides with the amino acid sequences of SEQ ID NO: 1, 2, 6, 8, 10, 11, and/or 13. The compositions also include fragments, variants, and/or mutants of the foregoing.

The nucleic acid constructs may be composed of single-stranded DNA, double-stranded DNA, cDNA, RNA, or cRNA. The nucleic acid vectors may be used in any known expression system, e.g., eukaryotic, prokaryotic, in vitro, etc. In preferred embodiments, the nucleic acid vectors are used in a prokaryotic system (e.g., E. coli bacteria) and the nucleic acid vectors carry a strong eukaryotic translation signal. For example, the nucleic acid vectors may carry a strong eukaryotic translation signal such as a Kozak sequence GCCGCC(A/G)CCAUGG, as described in Nakagawa et al. (2007) Nuc. Acids Res. 1-11. (doi:10.1093/nar/gkm1102). For example, the vectors may include the sequence GCCGCCACCATGGTC. The vectors may also include a ribosome binding site (e.g., a sequence such as AGGA). The strong eukaryotic translation signal may enable translation of multiple peptides, starting at different Met residues. For example, a vector containing a eukaryotic translation signal described herein and a nucleic acid sequence encoding SEQ ID NO: 1 may express both the long form of the peptide (as shown in FIG. 2) and the short form of the peptide (FIG. 14, SEQ ID NO: 13). In this example, the short form of the peptide begins at residues MVDDL of the original sequence shown in SEQ ID NO: 1.
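For illustration, the Kozak consensus GCCGCC(A/G)CCAUGG given above can be located in a vector sequence with a regular expression. The following is a sketch only: the function and constant names are hypothetical, RNA input is converted to the DNA alphabet before matching, and only exact matches to the consensus are reported (weaker, partial Kozak contexts are not scored).

```python
import re

# DNA form of the consensus GCCGCC(A/G)CCAUGG.
KOZAK_DNA = re.compile(r"GCCGCC[AG]CCATGG")
ATG_OFFSET = 9  # index of the A of the ATG start codon within the 13-nt consensus


def kozak_start_codons(seq: str) -> list[int]:
    """Return 0-based positions of ATG start codons that sit in an exact
    Kozak consensus context. Accepts DNA or RNA letters in any case."""
    dna = seq.upper().replace("U", "T")
    return [m.start() + ATG_OFFSET for m in KOZAK_DNA.finditer(dna)]


vector_fragment = "GCCGCCACCATGGTC"   # example sequence given in the text
starts = kozak_start_codons(vector_fragment)
print(starts)                                     # [9]
print(vector_fragment[starts[0]:starts[0] + 3])   # ATG
```

A position returned by this sketch marks a candidate translation start; whether translation actually initiates there in a given host is an experimental question that the code does not address.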

The inclusion of a strong eukaryotic translation signal in a nucleic acid vector may result in a higher yield of protein. Surprisingly, this may occur when such a vector is used in a bacterial system, such as when it is expressed in E. coli bacteria. As a result, the yield of protein isolated from the bacteria may be increased by more than 1.5-, 2-, 3-, 4-, 5-, 7-, 10-, 15-, 20-, 30-, or 40-fold. The nucleic acid vectors may also contain transcriptional control regions known in the art (e.g., promoters, enhancers, operators, etc.).

In some embodiments, provided herein are cells that incorporate one or more of the vectors of the invention. The cell may be a prokaryotic cell or a eukaryotic cell. The cell may be a bacterial cell (e.g., E. coli). In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a mouse myeloma hybridoma cell. In some embodiments, the cell is a Chinese hamster ovary (CHO) cell. Any suitable techniques, as known in the art, may be used to incorporate the vector(s) into the cell. The introduction of a nucleic acid vector may be by, e.g., permanent integration into the chromosomal nucleic acid, or by, e.g., introduction of an episomal genetic element.

Methods

The compositions disclosed herein can be used in a number of methods. Given that many of the peptides, polypeptides, fusion polypeptides, and compositions described herein have enhanced stability at warmer temperatures, they may be particularly useful for applications where it is not possible to refrigerate or freeze reagents or therapeutics. The peptide tags can thus be used in therapeutics, reagents, or diagnostics designed for use in remote regions without reliable access to electricity.

In preferred embodiments, the compositions are polymerases and are used in nucleic acid amplifications, such as polymerase chain reaction (PCR). General procedures for PCR are taught in U.S. Pat. No. 4,683,195 (Mullis) and U.S. Pat. No. 4,683,202 (Mullis et al.) and have been described elsewhere herein. Briefly, amplification of nucleic acids by PCR involves repeated cycles of heat-denaturing the DNA, annealing two primers to sequences that flank the target nucleic acid segment to be amplified, and extending the annealed primers with a polymerase. The primers hybridize to opposite strands of the target nucleic acid and are oriented so that the synthesis by the polymerase proceeds across the segment between the primers, effectively doubling the amount of the target segment. Moreover, because the extension products are also complementary to and capable of binding primers, each successive cycle essentially doubles the amount of target nucleic acids synthesized in the previous cycle. This results in exponential accumulation of the specific target nucleic acids at a rate of approximately 2^(n), where n is the number of cycles.
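The 2^(n) relationship can be made concrete with a short calculation. The sketch below assumes ideal doubling by default; the optional per-cycle `efficiency` parameter (1.0 for perfect doubling, lower values for less efficient reactions) is a commonly used generalization introduced here as an assumption, and the function name is hypothetical.

```python
def copies_after(n0: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of PCR cycles.

    n0         -- starting number of target copies
    cycles     -- number of cycles (n)
    efficiency -- fraction of templates copied per cycle; 1.0 gives n0 * 2**n
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return n0 * (1.0 + efficiency) ** cycles


print(copies_after(1, 30))        # 1073741824.0, i.e. 2**30 from a single copy
print(copies_after(1, 30, 0.9))   # fewer copies when each cycle is 90% efficient
```

In practice the exponential phase ends when primers, dNTPs, or enzyme activity become limiting, so the figure computed here is an upper bound for later cycles.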

A typical conventional PCR thermal cycling protocol comprises 30 cycles of (a) denaturation at a range of 90° C. to 95° C. for 0.5 to 1 minute, (b) annealing at a temperature ranging from 50° C. to 65° C. for 1 to 2 minutes, and (c) extension at 68° C. to 75° C. for at least 1 minute. Other protocols, including but not limited to a universal protocol as well as a fast cycling protocol, can be performed with the subject probes as well.

Another variation of the conventional PCR that can be performed with the compositions provided herein is “nested PCR” using nested primers. The method is preferred when the amount of target nucleic acid in a sample is extremely limited, for example, where archival or forensic samples are used. In performing nested PCR, the nucleic acid is first amplified with an outer set of primers capable of hybridizing to the sequences flanking a larger segment of the target nucleic acid. This amplification reaction is followed by a second round of amplification cycles using an inner set of primers that hybridizes to target sequences within the large segment.

In some embodiments, compositions disclosed herein can be used in a reverse-transcriptase PCR reaction (RT-PCR), in which a reverse transcriptase first converts RNA molecules to double stranded cDNA molecules, which are then employed as the template for subsequent amplification in the polymerase chain reaction. In carrying out RT-PCR, the reverse transcriptase is generally added to the reaction sample after the target nucleic acids are heat denatured. The reaction is then maintained at a suitable temperature (e.g., 30° C.-45° C.) for a sufficient amount of time (e.g., 5-60 minutes) to generate the cDNA template before the scheduled cycles of amplification take place. Such a reaction is particularly useful for detecting a biological entity whose genetic information is stored in RNA molecules.

In some embodiments, compositions provided herein can also be used in ligase chain polymerase chain reaction (LCR-PCR). The method involves ligating the target nucleic acids to a set of primer pairs, each having a target-specific portion and a short anchor sequence unrelated to the target sequences. A second set of primers containing the anchor sequence is then used to amplify the target sequences linked with the first set of primers. Procedures for conducting LCR-PCR are well known to artisans in the field, and hence are not detailed herein (see, e.g., U.S. Pat. No. 5,494,810).

In addition, the products of a polymerase reaction can be analyzed by any other method known in the art, e.g., HRM, gel electrophoresis, or capillary electrophoresis.

qPCR

In some embodiments, polymerases described herein can also be used in quantitative polymerase chain reactions (qPCR). qPCR, also called real-time PCR, may be used to amplify and simultaneously quantify a targeted DNA molecule. qPCR resembles conventional PCR, except that the amplified DNA is detected as the reaction progresses in real time, as opposed to at the end of the reaction. Two common methods for detection of products in real-time PCR are: (1) non-specific fluorescent dyes that intercalate with any double-stranded DNA (e.g., EvaGreen and other dyes described herein), and (2) sequence-specific DNA probes consisting of oligonucleotides that are labeled with a fluorescent reporter which permits detection only after hybridization of the probe with its complementary DNA target. TaqMan probes consist of a fluorophore covalently attached to the 5′-end of the oligonucleotide probe and a quencher at the 3′-end. Several different fluorophores (e.g., 6-carboxyfluorescein, acronym: FAM, or tetrachlorofluorescein, acronym: TET) and quenchers (e.g., tetramethylrhodamine, acronym: TAMRA, or dihydrocyclopyrroloindole tripeptide minor groove binder, acronym: MGB) are available. The quencher molecule quenches the fluorescence emitted by the fluorophore when excited by the cycler's light source via FRET (Fluorescence Resonance Energy Transfer). As long as the fluorophore and the quencher are in proximity, quenching inhibits any fluorescence signals.

In some embodiments, real-time PCR is combined with reverse transcription to quantify messenger RNA and non-coding RNA in cells or tissues. In further embodiments, the present invention provides quantitative evaluation of the amplification process in real-time by methods described herein. Evaluation of an amplification process in “real-time” involves determining the amount of amplicon in the reaction mixture either continuously or periodically during the amplification reaction, and the determined values are used to calculate the amount of target sequence initially present in the sample. There are a variety of methods for determining the amount of initial target sequence present in a sample based on real-time amplification. These include those disclosed by Wittwer et al., “Method for Quantification of an Analyte,” U.S. Pat. No. 6,303,305, and Yokoyama et al., “Method for Assaying Nucleic Acid,” U.S. Pat. No. 6,541,205. Another method for determining the quantity of target sequence initially present in a sample, but which is not based on a real-time amplification, is disclosed by Ryder et al., “Method for Determining Pre-Amplification Levels of a Nucleic Acid Target Sequence from Post-Amplification Levels of Product,” U.S. Pat. No. 5,710,029. The present invention is particularly suited to real-time evaluation, because the production of side-products is decreased, diminished, or substantially eliminated.
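One widely used way of calculating the amount of target initially present from real-time data is a standard curve, in which the threshold cycle (Ct) of dilution standards is regressed on the logarithm of their known starting quantities. The sketch below illustrates that general approach only; it is not asserted to be the method of any of the patents cited above, and the standards, Ct values, and function names are hypothetical.

```python
import math


def fit_standard_curve(standards: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(quantity) + intercept.

    standards -- list of (known starting quantity, measured Ct) pairs
    """
    xs = [math.log10(q) for q, _ in standards]
    ys = [ct for _, ct in standards]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_quantity(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting quantity of a sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope: float) -> float:
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series: (copies, Ct).
standards = [(1e5, 18.0), (1e4, 21.3), (1e3, 24.6), (1e2, 27.9)]
slope, intercept = fit_standard_curve(standards)
print(round(slope, 3), round(intercept, 3))
print(initial_quantity(23.0, slope, intercept))   # estimate for an unknown sample
print(amplification_efficiency(slope))
```

A slope near −3.32 corresponds to perfect doubling per cycle; the hypothetical data here were chosen to give a slope close to that value.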

Amplification products may be detected in real-time through the use of various self-hybridizing probes, most of which have a stem-loop structure. Such self-hybridizing probes are labeled so that they emit differently detectable signals, depending on whether the probes are in a self-hybridized state or an altered state through hybridization to a target sequence.

Another example of a detection probe having self-complementarity is a “molecular beacon.” Molecular beacons include nucleic acid molecules having a target complement sequence, an affinity pair (or nucleic acid arms) holding the probe in a closed conformation in the absence of a target sequence present in an amplification product, and a label pair that interacts when the probe is in a closed conformation. Hybridization of the target sequence and the target complement sequence separates the members of the affinity pair, thereby shifting the probe to an open conformation. The shift to the open conformation is detectable due to reduced interaction of the label pair, which may be, for example, a fluorophore and a quencher (e.g., DABCYL and EDANS). Molecular beacons are disclosed by Tyagi et al., “Detectably Labeled Dual Conformation Oligonucleotide Probes, Assays and Kits,” U.S. Pat. No. 5,925,517, and Tyagi et al., “Nucleic Acid Detection Probes Having Non-FRET Fluorescence Quenching and Kits and Assays Including Such Probes,” U.S. Pat. No. 6,150,097, each of which is hereby incorporated by reference herein in its entirety.

Other self-hybridizing probes for use in the present invention are well known to those of ordinary skill in the art. By way of example, probe binding pairs having interacting labels, such as those disclosed by Morrison, “Competitive Homogenous Assay,” U.S. Pat. No. 5,928,862 (the contents of which are hereby incorporated by reference herein), might be adapted for use in the present invention. Probe systems used to detect single nucleotide polymorphisms (SNPs) might also be utilized in the present invention. Additional detection systems include “molecular switches,” as disclosed by Arnold et al., “Oligonucleotides Comprising a Molecular Switch,” U.S. Provisional Application No. 60/467,517, which enjoys common ownership with the present application and is hereby incorporated by reference herein in its entirety. Other probes, such as those comprising intercalating dyes and/or fluorochromes, might also be useful for detection of amplification products in the present invention. See, e.g., Ishiguro et al., “Method of Detecting Specific Nucleic Acid Sequences,” U.S. Pat. No. 5,814,447, the contents of which are hereby incorporated by reference herein.

In some embodiments, the signals produced in the qPCR reactions described herein may be detected in a variety of ways. Generally, a change of signal intensity can be detected by any method known in the art and is generally dependent on the choice of fluorescent group used. It can be performed with the aid of an optical system. Such a system typically comprises at least two elements, namely an excitation source and a photon detector. Numerous examples of these elements are available in the art. An exemplary excitation source is a laser, such as a polarized laser. The choice of laser light will depend on the fluorescent group attached to the probe. For most of the fluorescent groups, the required excitation light is within the range of about 300 nm to about 1200 nm, or more commonly from about 350 nm to about 900 nm. Alternatively, compounds of the invention may be excited using an excitation wavelength of about 300 to about 350 nm, 350 to 400 nm, 400 to 450 nm, 450 to 500 nm, 500 to 550 nm, 550 to 600 nm, 600 to 650 nm, 650 to 700 nm, 750 nm to 800 nm, or from 800 nm to 850 nm, merely by way of example. Those skilled in the art can readily ascertain the appropriate excitation wavelength to excite a given fluorophore by routine experimentation (see, e.g., The Handbook: ‘A Guide to Fluorescent Probes and Labeling Technologies, Tenth Edition’ (2005) (available from Invitrogen, Inc./Molecular Probes) previously incorporated herein by reference). Where desired, one can employ other optical systems. These optical systems may comprise elements such as an optical reader, high-efficiency photon detection system, photomultiplier tube, gate-sensitive FETs, nanotube FETs, photodiode (e.g., avalanche photodiodes (APD)), camera, charge-coupled device (CCD), electron-multiplying charge-coupled device (EMCCD), intensified charge-coupled device (ICCD), and confocal microscope. These optical systems may also comprise optical transmission elements such as optic fibers, optical switches, mirrors, lenses (including microlenses and nanolenses), and collimators. Other examples include optical attenuators, polarization filters (e.g., dichroic filter), wavelength filters (low-pass, band-pass, or high-pass), wave-plates, and delay lines. In some embodiments, the optical transmission element can be planar waveguides in optical communication with the arrayed optical confinements. See, e.g., U.S. Pat. Nos. 7,292,742, 7,181,122, 7,013,054, 6,917,726, 7,267,673, and 7,170,050. These and other optical components known in the art can be combined and assembled in a variety of ways to effect detection of distinguishable signals.

High Resolution Melt (HRM) analysis can also be used to detect and quantify amplified DNA following a PCR reaction using any of the polymerases described herein. Nonlimiting examples of uses for HRM include: SNP typing/point mutation detection; zygosity testing at a particular locus; and analyzing DNA methylation status.
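Melt data are commonly summarized by taking the negative first derivative of fluorescence with respect to temperature and reading the melting temperature (Tm) at its peak. The following is a minimal sketch of that general approach using finite differences; the function name is hypothetical, and the melt curve is synthetic (a sigmoid with an assumed Tm of 82° C.) rather than measured data from this disclosure.

```python
import math


def melting_temperature(temps: list[float], fluorescence: list[float]) -> float:
    """Estimate Tm as the temperature where -dF/dT is largest.

    The derivative is approximated between neighbouring readings and
    assigned to the midpoint of each temperature interval.
    """
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equally long series with at least two readings")
    best_temp, best_slope = temps[0], float("-inf")
    for i in range(len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_temp = (temps[i] + temps[i + 1]) / 2.0
    return best_temp


# Synthetic melt curve: fluorescence drops sigmoidally around an assumed Tm.
ASSUMED_TM = 82.0
temps = [75.0 + 0.1 * i for i in range(151)]              # 75.0 to 90.0 C
signal = [1.0 / (1.0 + math.exp((t - ASSUMED_TM) / 0.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"estimated Tm: {tm:.2f} C")
```

Real HRM software typically also normalizes and smooths the curves before comparing samples; those steps are omitted here.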

The polymerases and other compositions described herein may be used in a wide variety of molecular biology applications. Nonlimiting examples include: sequencing reactions; cloning; mutagenesis; gene detection; point mutation detection; subtractive hybridization; and microarrays.

Methods of Manufacturing or Synthesis of Peptides, Polypeptides, or Fusion Polypeptides

Peptides, polypeptides, or fusion polypeptides provided herein may be made using recombinant or synthetic techniques well known in the art. In particular, solid phase protein synthesis is well suited to the relatively short length of the peptides, polypeptides, or fusion polypeptides and may provide greater yields with more consistent results. Additionally, solid phase protein synthesis may provide additional flexibility regarding the manufacture of the peptides, polypeptides, or fusion polypeptides. For example, desired chemical modifications may be incorporated into the peptides, polypeptides, or fusion polypeptides at the synthesis stage: homocitrulline could be used in the synthesis of the peptide as opposed to lysine, thereby obviating the need to carbamylate the peptide following synthesis.

Synthesis

In solid-phase synthesis of a peptide, an amino acid with both alpha-amino group and side chain protection is immobilized on a resin. See, e.g., Nilsson, B., Soellner, M., and Raines, R. Chemical Synthesis of Proteins, Annu. Rev. Biomol. Struct. 2005. 34:91-118; Meldal M. 1997, Properties of solid supports. Methods Enzymol. 289:83-104; and Songster M F, Barany G. 1997, Handles for solid-phase peptide synthesis, Methods Enzymol. 289:126-74. Typically, two types of alpha-amino-protecting groups are used: an acid-sensitive tert-butoxycarbonyl (Boc) group or a base-sensitive 9-fluorenylmethyloxycarbonyl (Fmoc) group. See Wellings D A, Atherton E. 1997. Standard Fmoc protocols. Methods Enzymol. 289:44-67. After the quick and complete removal of these alpha-amino-protecting groups, another protected amino acid with an activated carboxyl group can then be coupled to the unprotected resin-bound amine. By using an excess of activated soluble amino acid, the coupling reactions are forced to completion. The cycle of deprotection and coupling is repeated to complete the sequence. With side chain deprotection and cleavage, the resin yields the desired peptide. See Guy C A, Fields G B. 1997, Trifluoroacetic acid cleavage and deprotection of resin-bound peptides following synthesis by Fmoc chemistry, Methods Enzymol. 289:67-83, and Stewart J M. 1997, Cleavage methods following Boc-based solid-phase peptide synthesis, Methods Enzymol. 289:29-44. Additional methods for performing solid phase protein synthesis are disclosed in Bang, D. & Kent, S., 2004, A One-Pot Total Synthesis of Crambin, Angew. Chem. Int. Ed. 43:2534-2538; Bang, D., Chopra, N., & Kent, S. 2004, Total Chemical Synthesis of Crambin, J. Am. Chem. Soc. 126:1377-1383; Dawson, P. et al., 1994, Synthesis of Proteins by Native Chemical Ligation, Science, 266:776-779; Kochendoerfer et al., 2003, Design and Chemical Synthesis of a Homogenous Polymer-Modified Erythropoiesis Protein, Science, 299:884-887.

If necessary, smaller peptides derived from solid phase peptide synthesis may be combined through peptide ligations such as native chemical ligation. In this process, the thiolate of an N-terminal cysteine residue of one peptide attacks the C-terminal thioester of a second peptide to effect transthioesterification. An amide linkage forms after rapid S→N acyl transfer. See Dawson, P. et al. 1994, Synthesis of Proteins by Native Chemical Ligation, Science, 266:776-779.

Further, peptides, polypeptides, or fusion polypeptides provided herein may encompass peptidomimetics, peptides including both naturally occurring and non-naturally occurring amino acids, such as peptoids. Peptoids are oligomers of N-substituted glycines, glycoholic acid, thiopronine, sarcosine, and thiorphan. These structures tend to have a general structure of (—(C═O)—CH₂—NR—)_(n) with the R group acting as the side chain. Such peptoids can be synthesized using solid phase synthesis in accordance with the protocols of Simon et al., Peptoids: A molecular approach to drug discovery, Proc. Natl. Acad. Sci USA, 89:9367-9371 (1992); and Li et al., Photolithographic Synthesis of Peptoids, J. AM. CHEM. SOC. 2004, 126, 4088-4089. Additionally, provided herein are uses of peptidomimetics or peptide mimetics, non-peptide drugs with properties analogous to those of the template peptide. (Fauchere, J. (1986) Adv. Drug Res. 15:29; Veber and Friedinger (1985) TINS p. 32; and Evans et al. (1987) J. Med. Chem 30:1229). Synthesis of various types of peptidomimetics has been reviewed for example in: Methods of Organic Chemistry (Houben-Weyl), Synthesis of Peptides and Peptidomimetics—Workbench Edition Volume E22c (Editor-in-Chief Goodman M.) 2004.

Recombinant Techniques

A variety of host-expression vector systems may be utilized to produce the peptides, polypeptides, or fusion polypeptides provided herein. Such host-expression systems represent vehicles by which the peptides, polypeptides, or fusion polypeptides of interest may be produced and subsequently purified, but also represent cells that may, when transformed or transfected with the appropriate nucleotide coding sequences, exhibit the modified gene product in situ. These include, but are not limited to, bacterial, insect, plant, and mammalian (including human) host systems, such as, but not limited to, insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing the peptide, polypeptide, or fusion polypeptide coding sequences; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing coding sequences; or mammalian cell systems, including human cell systems, e.g., HT1080, COS, CHO, BHK, 293, 3T3, harboring recombinant expression constructs containing promoters derived from the genome of mammalian cells, e.g., metallothionein promoter, or from mammalian viruses, e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter.

In addition, a host cell strain may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications and processing of protein products may be important for the function of the protein. Different host cells have specific mechanisms for the post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used. Such mammalian host cells, including human host cells, include but are not limited to HT1080, CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, and WI38.

For long-term, high-yield production of recombinant peptides, stable expression is preferred. For example, cell lines that stably express the recombinant tissue protective cytokine-related molecule gene product may be engineered. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with DNA controlled by appropriate expression control elements, e.g., promoter, enhancer, sequences, transcription terminators, polyadenylation sites, and the like, and a selectable marker. Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched media, and then are switched to a selective media. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci that in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines that express the tissue-protective product. Such engineered cell lines may be particularly useful in screening and evaluation of compounds that affect the endogenous activity of the gene product.

Synthesis of Polynucleotides

Any known methods for synthesizing polynucleotides may be used. Solid phase synthesis disclosed by Caruthers et al. in U.S. Pat. No. 4,458,066 may be used. In this technique, the growing DNA chain is attached to an insoluble support via a long organic linker which allows the growing DNA chain to be solubilized in the solvent in which the support is placed. The solubilized, yet immobilized, DNA chain is thereby allowed to react with reagents in the surrounding solvent and allows for the easy washing away of the reagents from the solid support to which the oligonucleotide is attached.

There are several sites on the nucleosides of similar chemical nature, e.g. —OH or hydroxyl groups. However, during oligonucleotide synthesis, the monomer subunits must be attached to the growing oligonucleotide molecule in a site-specific manner. This requires functionalizing a site either on the growing chain or on the incoming base for attachment of the incoming monomer building block to the growing chain. To prevent the incoming monomer from attaching at the wrong site, the wrong sites must be blocked while the correct site is left open to react. This requires the use of protecting groups, which are compounds attached temporarily to a potentially reactive site so as to prevent it from reacting. The protecting group must be stable during said reactions and yet must eventually be removed to yield the original site. The synthesis of oligonucleotides requires several sites to be protected and particular sites must be deprotected while others remain protected. These protecting groups grouped together as a set are termed orthogonal protecting groups.

Solid phase oligonucleotide synthesis protocols typically use a dimethoxytrityl protecting group for the 5′ hydroxyl of nucleosides. A phosphoramidite functionality is utilized at the 3′ hydroxyl position. The synthesis generally proceeds from the 3′ to the 5′ of the ribose or deoxyribose sugar component of the phosphoramidite nucleoside in a synthesis cycle which adds one nucleotide at a time to the growing oligonucleotide chain. Beaucage et al. (1981) Tetrahedron Lett. 22:1859. In the first step of the synthesis cycle, the “coupling” step, the 5′ end of the growing chain is coupled with the 3′ phosphoramidite of the incoming monomer to form a phosphite triester intermediate (the 5′ hydroxyl of the added monomer has a protecting group so only one new monomer is added to the growing chain per cycle). Matteucci et al. (1981) J. Am. Chem. Soc. 103:3185. Next, an optional “capping reaction” is used to stop the synthesis on any chains having an unreacted 5′ hydroxyl, which would be one nucleotide short at the end of synthesis. The phosphite triester intermediate is subjected to oxidation (the “oxidation” step) after each coupling reaction to yield a more stable phosphotriester intermediate. Without oxidation, the unstable phosphite triester linkage would cleave under the acidic conditions of subsequent synthesis steps. Letsinger et al. (1976) J. Am. Chem. Soc. 98:3655. Removal of the 5′ protecting group of the newly added monomer (the “deprotection” step) is typically accomplished by reaction with acidic solution to yield a free 5′ hydroxyl group, which can be coupled to the next protected nucleoside phosphoramidite. This process is repeated for each monomer added until the desired sequence is synthesized.
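By way of example only, the order of operations in the synthesis cycle described above can be listed programmatically. The sketch below assumes that the 3′-terminal nucleoside is already attached to the support with its 5′ dimethoxytrityl group in place and must be deprotected before the first coupling. The function name and the step labels are hypothetical and do not describe any particular synthesizer.

```python
from __future__ import annotations


def synthesis_steps(sequence_5_to_3: str, cap: bool = True) -> list[str]:
    """List the ordered chemical steps for building an oligonucleotide.

    The chain grows 3' to 5', so monomers are added in the reverse of the
    conventional 5'-to-3' written order. Capping is optional.
    """
    seq = sequence_5_to_3.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    order = seq[::-1]  # 3'-terminal base first
    # Assumption: the first nucleoside is pre-loaded on the support.
    steps = [
        f"start: support-bound {order[0]} (5'-DMT on)",
        "deprotect: acid removes 5'-DMT",
    ]
    for base in order[1:]:
        steps.append(f"couple: add 5'-DMT {base} 3'-phosphoramidite")
        if cap:
            steps.append("cap: block unreacted 5'-hydroxyls")
        steps.append("oxidize: phosphite triester to phosphotriester")
        steps.append("deprotect: acid removes 5'-DMT")
    return steps


if __name__ == "__main__":
    for step in synthesis_steps("ACG"):
        print(step)
```

Calling the function with `cap=False` omits the optional capping step from each cycle while leaving the other steps in the same order.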

According to some protocols, the synthesis cycle of couple, cap, oxidize, and deprotect is shortened by omitting the capping step or by taking the oxidation step ‘outside’ of the cycle and performing a single oxidation reaction on the completed chain. For example, oligonucleotide synthesis according to H-phosphonate protocols will permit a single oxidation step at the conclusion of the synthesis cycles. However, coupling yields are less efficient than those for phosphoramidite chemistry and oxidation requires longer times and harsher reagents than amidite chemistry.

The chemical group conventionally used for the protection of nucleoside 5′-hydroxyls is dimethoxytrityl (“DMT”), which is removable with acid. Khorana (1968) Pure Appl. Chem. 17:349; Smith et al. (1962) J. Am. Chem. Soc. 84:430. This acid-labile protecting group provides a number of advantages for working with both nucleosides and oligonucleotides. For example, the DMT group can be introduced onto a nucleoside regioselectively and in high yield. Brown et al. (1979) Methods in Enzymol. 68:109. Also, the lipophilicity of the DMT group greatly increases the solubility of nucleosides in organic solvents, and the carbocation resulting from acidic deprotection gives a strong chromophore, which can be used to indirectly monitor coupling efficiency. Matteucci et al. (1980) Tetrahedron Lett. 21:719. In addition, the hydrophobicity of the group can be used to aid separation on reverse-phase HPLC. Becker et al. (1985) J. Chromatogr. 326:219.
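As a non-limiting illustration of how the DMT carbocation may be used to monitor coupling indirectly, the sketch below estimates per-cycle efficiency from a series of absorbance readings taken on the acid deprotection effluent. It assumes that absorbance is proportional to the amount of DMT cation released and that each reading is taken at the same volume and dilution. The function names and the readings are hypothetical.

```python
from __future__ import annotations


def stepwise_efficiencies(trityl_absorbances: list[float]) -> list[float]:
    """Ratio of each cycle's DMT-cation absorbance to the previous cycle's."""
    if len(trityl_absorbances) < 2:
        raise ValueError("need at least two readings")
    if any(a <= 0 for a in trityl_absorbances):
        raise ValueError("absorbances must be positive")
    pairs = zip(trityl_absorbances, trityl_absorbances[1:])
    return [later / earlier for earlier, later in pairs]


def average_stepwise_efficiency(trityl_absorbances: list[float]) -> float:
    """Geometric-mean efficiency per cycle from the first and last readings."""
    n_steps = len(trityl_absorbances) - 1
    if n_steps < 1:
        raise ValueError("need at least two readings")
    if trityl_absorbances[0] <= 0 or trityl_absorbances[-1] <= 0:
        raise ValueError("absorbances must be positive")
    return (trityl_absorbances[-1] / trityl_absorbances[0]) ** (1.0 / n_steps)


if __name__ == "__main__":
    readings = [1.00, 0.99, 0.97, 0.96]  # hypothetical absorbance values
    print(stepwise_efficiencies(readings))
    print(average_stepwise_efficiency(readings))
```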

Methods Related to Treating, Preventing or Diagnosing Disease

In some embodiments, polymerases disclosed herein can be used in numerous applications, including profiling gene expression, identifying sequence variations, detecting microbes, and determining viral load. Given that they are stable at warmer temperatures, they may be particularly useful in kits designed to diagnose disease in warmer climates. In preferred embodiments, the polymerases are used to evaluate infectious disease status (e.g., HIV-1, HIV-2, hepatitis viruses (e.g., hepatitis A, hepatitis B, hepatitis C), malaria) of a subject. The polymerases disclosed herein may be used in kits for detecting such pathogens as well as kits designed to identify viral load. In other cases, the polymerases may be used to diagnose, treat, or provide a prognosis for genetic diseases (e.g., cancers, neurological diseases such as Alzheimer's Disease).

The polymerases may be used to evaluate, treat, diagnose infections caused by numerous viruses including: Abelson leukemia virus, Abelson murine leukemia virus, Abelson's virus, Acute laryngotracheobronchitis virus, Adelaide River virus, Adeno associated virus group, Adenovirus, African horse sickness virus, African swine fever virus, AIDS virus, Aleutian mink disease parvovirus, Alpharetrovirus, Alphavirus, ALV related virus, Amapari virus, Aphthovirus, Aquareovirus, Arbovirus, Arbovirus C, arbovirus group A, arbovirus group B, Arenavirus group, Argentine hemorrhagic fever virus, Argentine hemorrhagic fever virus, Arterivirus, Astrovirus, Ateline herpesvirus group, Aujezky's disease virus, Aura virus, Ausduk disease virus, Australian bat lyssavirus, Aviadenovirus, avian erythroblastosis virus, avian infectious bronchitis virus, avian leukemia virus, avian leukosis virus, avian lymphomatosis virus, avian myeloblastosis virus, avian paramyxovirus, avian pneumoencephalitis virus, avian reticuloendotheliosis virus, avian sarcoma virus, avian type C retrovirus group, Avihepadnavirus, Avipoxvirus, B virus, B19 virus, Babanki virus, baboon herpesvirus, baculovirus, Barmah Forest virus, Bebaru virus, Berrimah virus, Betaretrovirus, Birnavirus, Bittner virus, BK virus, Black Creek Canal virus, bluetongue virus, Bolivian hemorrhagic fever virus, Boma disease virus, border disease of sheep virus, borna virus, bovine alphaherpesvirus 1, bovine alphaherpesvirus 2, bovine coronavirus, bovine ephemeral fever virus, bovine immunodeficiency virus, bovine leukemia virus, bovine leukosis virus, bovine mammillitis virus, bovine papillomavirus, bovine papular stomatitis virus, bovine parvovirus, bovine syncytial virus, bovine type C oncovirus, bovine viral diarrhea virus, Buggy Creek virus, bullet shaped virus group, Bunyamwera virus supergroup, Bunyavirus, Burkitt's lymphoma virus, Bwamba Fever, CA virus, Calicivirus, California encephalitis virus, camelpox virus, canarypox virus, canid herpesvirus, canine
coronavirus, canine distemper virus, canine herpesvirus, canine minute virus, canine parvovirus, Cano Delgadito virus, caprine arthritis virus, caprine encephalitis virus, Caprine Herpes Virus, Capripox virus, Cardiovirus, caviid herpesvirus 1, Cercopithecid herpesvirus 1, cercopithecine herpesvirus 1, Cercopithecine herpesvirus 2, Chandipura virus, Changuinola virus, channel catfish virus, Charleville virus, chickenpox virus, Chikungunya virus, chimpanzee herpesvirus, chub reovirus, chum salmon virus, Cocal virus, Coho salmon reovirus, coital exanthema virus, Colorado tick fever virus, Coltivirus, Columbia SK virus, common cold virus, contagious ecthyma virus, contagious pustular dermatitis virus, Coronavirus, Corriparta virus, coryza virus, cowpox virus, coxsackie virus, CPV (cytoplasmic polyhedrosis virus), cricket paralysis virus, Crimean-Congo hemorrhagic fever virus, croup associated virus, Cryptovirus, Cypovirus, Cytomegalovirus, cytomegalovirus group, cytoplasmic polyhedrosis virus, deer papillomavirus, deltaretrovirus, dengue virus, Densovirus, Dependovirus, Dhori virus, diploma virus, Drosophila C virus, duck hepatitis B virus, duck hepatitis virus 1, duck hepatitis virus 2, duovirus, Duvenhage virus, Deformed wing virus DWV, eastern equine encephalitis virus, eastern equine encephalomyelitis virus, EB virus, Ebola virus, Ebola-like virus, echo virus, echovirus, echovirus 10, echovirus 28, echovirus 9, ectromelia virus, EEE virus, EIA virus, EIA virus, encephalitis virus, encephalomyocarditis group virus, encephalomyocarditis virus, Enterovirus, enzyme elevating virus, enzyme elevating virus (LDH), epidemic hemorrhagic fever virus, epizootic hemorrhagic disease virus, Epstein-Barr virus, equid alphaherpesvirus 1, equid alphaherpesvirus 4, equid herpesvirus 2, equine abortion virus, equine arteritis virus, equine encephalosis virus, equine infectious anemia virus, equine morbillivirus, equine rhinopneumonitis virus, equine rhinovirus, Eubenangu virus, European elk
papillomavirus, European swine fever virus, Everglades virus, Eyach virus, felid herpesvirus 1, feline calicivirus, feline fibrosarcoma virus, feline herpesvirus, feline immunodeficiency virus, feline infectious peritonitis virus, feline leukemia/sarcoma virus, feline leukemia virus, feline panleukopenia virus, feline parvovirus, feline sarcoma virus, feline syncytial virus, Filovirus, Flanders virus, Flavivirus, foot and mouth disease virus, Fort Morgan virus, Four Corners hantavirus, fowl adenovirus 1, fowlpox virus, Friend virus, Gammaretrovirus, GB hepatitis virus, GB virus, German measles virus, Getah virus, gibbon ape leukemia virus, glandular fever virus, goatpox virus, golden shinner virus, Gonometa virus, goose parvovirus, granulosis virus, Gross' virus, ground squirrel hepatitis B virus, group A arbovirus, Guanarito virus, guinea pig cytomegalovirus, guinea pig type C virus, Hantaan virus, Hantavirus, hard clam reovirus, hare fibroma virus, HCMV (human cytomegalovirus), hemadsorption virus 2, hemagglutinating virus of Japan, hemorrhagic fever virus, hendra virus, Henipaviruses, Hepadnavirus, hepatitis A virus, hepatitis B virus group, hepatitis C virus, hepatitis D virus, hepatitis delta virus, hepatitis E virus, hepatitis F virus, hepatitis G virus, hepatitis nonA nonB virus, hepatitis virus, hepatitis virus (nonhuman), hepatoencephalomyelitis reovirus 3, Hepatovirus, heron hepatitis B virus, herpes B virus, herpes simplex virus, herpes simplex virus 1, herpes simplex virus 2, herpesvirus, herpesvirus 7, Herpesvirus ateles, Herpesvirus hominis, Herpesvirus infection, Herpesvirus saimiri, Herpesvirus suis, Herpesvirus varicellae, Highlands J virus, Hirame rhabdovirus, hog cholera virus, human adenovirus 2, human alphaherpesvirus 1, human alphaherpesvirus 2, human alphaherpesvirus 3, human B lymphotropic virus, human betaherpesvirus 5, human coronavirus, human cytomegalovirus group, human foamy virus, human gammaherpesvirus 4, human gammaherpesvirus 6, human hepatitis A virus, human
herpesvirus 1 group, human herpesvirus 2 group, human herpesvirus 3 group, human herpesvirus 4 group, human herpesvirus 6, human herpesvirus 8, human immunodeficiency virus, human immunodeficiency virus 1, human immunodeficiency virus 2, human papillomavirus, human T cell leukemia virus, human T cell leukemia virus I, human T cell leukemia virus II, human T cell leukemia virus III, human T cell lymphoma virus I, human T cell lymphoma virus II, human T cell lymphotropic virus type 1, human T cell lymphotropic virus type 2, human T lymphotropic virus I, human T lymphotropic virus II, human T lymphotropic virus III, Ichnovirus, infantile gastroenteritis virus, infectious bovine rhinotracheitis virus, infectious haematopoietic necrosis virus, infectious pancreatic necrosis virus, influenza virus A, influenza virus B, influenza virus C, influenza virus D, influenza virus pr8, insect iridescent virus, insect virus, iridovirus, Japanese B virus, Japanese encephalitis virus, JC virus, Junin virus, Kaposi's sarcoma-associated herpesvirus, Kemerovo virus, Kilham's rat virus, Klamath virus, Kolongo virus, Korean hemorrhagic fever virus, kumba virus, Kysanur forest disease virus, Kyzylagach virus, La Crosse virus, lactic dehydrogenase elevating virus, lactic dehydrogenase virus, Lagos bat virus, Langur virus, lapine parvovirus, Lassa fever virus, Lassa virus, latent rat virus, LCM virus, Leaky virus, Lentivirus, Leporipoxvirus, leukemia virus, leukovirus, lumpy skin disease virus, lymphadenopathy associated virus, Lymphocryptovirus, lymphocytic choriomeningitis virus, lymphoproliferative virus group, Machupo virus, mad itch virus, mammalian type B oncovirus group, mammalian type B retroviruses, mammalian type C retrovirus group, mammalian type D retroviruses, mammary tumor virus, Mapuera virus, Marburg virus, Marburg-like virus, Mason Pfizer monkey virus, Mastadenovirus, Mayaro virus, ME virus, measles virus, Menangle virus, Mengo virus, Mengovirus, Middelburg virus, milkers nodule virus, mink enteritis
virus, minute virus of mice, MLV related virus, MM virus, Mokola virus, Molluscipoxvirus, Molluscum contagiosum virus, monkey B virus, monkeypox virus, Mononegavirales, Morbillivirus, Mount Elgon bat virus, mouse cytomegalovirus, mouse encephalomyelitis virus, mouse hepatitis virus, mouse K virus, mouse leukemia virus, mouse mammary tumor virus, mouse minute virus, mouse pneumonia virus, mouse poliomyelitis virus, mouse polyomavirus, mouse sarcoma virus, mousepox virus, Mozambique virus, Mucambo virus, mucosal disease virus, mumps virus, murid betaherpesvirus 1, murid cytomegalovirus 2, murine cytomegalovirus group, murine encephalomyelitis virus, murine hepatitis virus, murine leukemia virus, murine nodule inducing virus, murine polyomavirus, murine sarcoma virus, Muromegalovirus, Murray Valley encephalitis virus, myxoma virus, Myxovirus, Myxovirus multiforme, Myxovirus parotitidis, Nairobi sheep disease virus, Nairovirus, Nanirnavirus, Nariva virus, Ndumo virus, Neethling virus, Nelson Bay virus, neurotropic virus, New World Arenavirus, newborn pneumonitis virus, Newcastle disease virus, Nipah virus, noncytopathogenic virus, Norwalk virus, nuclear polyhedrosis virus (NPV), nipple neck virus, O'nyong'nyong virus, Ockelbo virus, oncogenic virus, oncogenic viruslike particle, oncornavirus, Orbivirus, Orf virus, Oropouche virus, Orthohepadnavirus, Orthomyxovirus, Orthopoxvirus, Orthoreovirus, Orungo, ovine papillomavirus, ovine catarrhal fever virus, owl monkey herpesvirus, Palyam virus, Papillomavirus, Papillomavirus sylvilagi, Papovavirus, parainfluenza virus, parainfluenza virus type 1, parainfluenza virus type 2, parainfluenza virus type 3, parainfluenza virus type 4, Paramyxovirus, Parapoxvirus, paravaccinia virus, Parvovirus, Parvovirus B19, parvovirus group, Pestivirus, Phlebovirus, phocine distemper virus, Picodnavirus, Picornavirus, pig cytomegalovirus-pigeonpox virus, Piry virus, Pixuna virus, pneumonia virus of mice, Pneumovirus, poliomyelitis virus, poliovirus, Polydnavirus,
polyhedral virus, polyoma virus, Polyomavirus, Polyomavirus bovis, Polyomavirus cercopitheci, Polyomavirus hominis 2, Polyomavirus maccacae 1, Polyomavirus muris 1, Polyomavirus muris 2, Polyomavirus papionis 1, Polyomavirus papionis 2, Polyomavirus sylvilagi, Pongine herpesvirus 1, porcine epidemic diarrhea virus, porcine hemagglutinating encephalomyelitis virus, porcine parvovirus, porcine transmissible gastroenteritis virus, porcine type C virus, pox virus, poxvirus, poxvirus variolae, Prospect Hill virus, Provirus, pseudocowpox virus, pseudorabies virus, psittacinepox virus, quailpox virus, rabbit fibroma virus, rabbit kidney vaculolating virus, rabbit papillomavirus, rabies virus, raccoon parvovirus, raccoonpox virus, Ranikhet virus, rat cytomegalovirus, rat parvovirus, rat virus, Rauscher's virus, recombinant vaccinia virus, recombinant virus, reovirus, reovirus 1, reovirus 2, reovirus 3, reptilian type C virus, respiratory infection virus, respiratory syncytial virus, respiratory virus, reticuloendotheliosis virus, Rhabdovirus, Rhabdovirus carpia, Rhadinovirus, Rhinovirus, Rhizidiovirus, Rift Valley fever virus, Riley's virus, rinderpest virus, RNA tumor virus, Ross River virus, Rotavirus, rougeole virus, Rous sarcoma virus, rubella virus, rubeola virus, Rubivirus, Russian autumn encephalitis virus, SA 11 simian virus, SA2 virus, Sabia virus, Sagiyama virus, Saimirine herpesvirus 1, salivary gland virus, sandfly fever virus group, Sandjimba virus, SARS virus, SDAV (sialodacryoadenitis virus), sealpox virus, Semliki Forest Virus, Seoul virus, sheeppox virus, Shope fibroma virus, Shope papilloma virus, simian foamy virus, simian hepatitis A virus, simian human immunodeficiency virus, simian immunodeficiency virus, simian parainfluenza virus, simian T cell lymphotrophic virus, simian virus, simian virus 40, Simplexvirus, Sin Nombre virus, Sindbis virus, smallpox virus, South American hemorrhagic fever viruses, sparrowpox virus, Spumavirus, squirrel fibroma virus, squirrel monkey
retrovirus, SSV 1 virus group, STLV (simian T lymphotropic virus) type I, STLV (simian T lymphotropic virus) type II, STLV (simian T lymphotropic virus) type III, stomatitis papulosa virus, submaxillary virus, suid alphaherpesvirus 1, suid herpesvirus 2, Suipoxvirus, swamp fever virus, swinepox virus, Swiss mouse leukemia virus, TAC virus, Tacaribe complex virus, Tacaribe virus, Tanapox virus, Taterapox virus, Tench reovirus, Theiler's encephalomyelitis virus, Theiler's virus, Thogoto virus, Thottapalayam virus, Tick borne encephalitis virus, Tioman virus, Togavirus, Torovirus, tumor virus, Tupaia virus, turkey rhinotracheitis virus, turkeypox virus, type C retroviruses, type D oncovirus, type D retrovirus group, ulcerative disease rhabdovirus, Una virus, Uukuniemi virus group, vaccinia virus, vacuolating virus, varicella zoster virus, Varicellovirus, Varicola virus, variola major virus, variola virus, Vasin Gishu disease virus, VEE virus, Venezuelan equine encephalitis virus, Venezuelan equine encephalomyelitis virus, Venezuelan hemorrhagic fever virus, vesicular stomatitis virus, Vesiculovirus, Vilyuisk virus, viper retrovirus, viral haemorrhagic septicemia virus, Visna Maedi virus, Visna virus, volepox virus, VSV (vesicular stomatitis virus), Wallal virus, Warrego virus, wart virus, WEE virus, West Nile virus, western equine encephalitis virus, western equine encephalomyelitis virus, Whataroa virus, Winter Vomiting Virus, woodchuck hepatitis B virus, woolly monkey sarcoma virus, wound tumor virus, WRSV virus, Yaba monkey tumor virus, Yaba virus, Yatapoxvirus, yellow fever virus, and the Yug Bogdanovac virus.

EXAMPLES

Example 1: Amplification of Barley Genomic DNA

FIG. 15 shows results from amplification of barley genomic DNA using Peptide Tag-Polymerase (an aldehyde-modified form of SEQ ID NO:2) in comparison with two commercially available polymerases (ABI Gold™ and Roche FastStart™).

Barley genomic DNA was obtained from 5 different genomes. Lane 7 shows a 100 bp ladder.

A description of the lanes is given below:

Lanes 1-3—Peptide Tag-Polymerase at 2.5 U/100 μl PCR reaction; Peptide Tag-Polymerase 10-fold dilution starting from 1 ng/μl

Lanes 4-6—Peptide Tag-Polymerase at 4 U/100 μl

Lanes 8-10 ABI Gold™ at 2.5 U/100 μl PCR reaction; ABI Gold™ 10-fold dilution starting from 1 ng/μl

Lanes 11-13—ABI Gold™ at 4 U/100 μl

Lanes 14-16 Roche FastStart™ at 2.5 U/100 μl PCR reaction; ABI Gold™ 10-fold dilution starting from 1 ng/μl

Lanes 17-19 Roche FastStart™ at 4 U/100 μl

Lanes 1-3 (Peptide Tag-Polymerase) 2.5 U only—shows amplification of DNA

Lane 8 (ABI) shows small nonspecific product (smear)

Lanes 14-16—(Roche) shows no product at 2.5 U
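For illustration only, the quantities in the lane descriptions above can be worked through in a few lines. The sketch assumes that "10-fold dilution starting from 1 ng/μl" denotes a serial ten-fold dilution with one dilution per lane across each group of three lanes, and that enzyme units scale linearly with reaction volume. Neither assumption is stated in the lane descriptions, and the function names are hypothetical.

```python
from __future__ import annotations


def serial_dilution(start: float, fold: float, n_points: int) -> list[float]:
    """Concentrations of a serial dilution, starting with the undiluted stock."""
    if start <= 0 or fold <= 1 or n_points < 1:
        raise ValueError("need start > 0, fold > 1 and n_points >= 1")
    return [start / fold ** i for i in range(n_points)]


def units_for_volume(units_per_100_ul: float, reaction_volume_ul: float) -> float:
    """Scale an enzyme amount quoted per 100 ul to another reaction volume."""
    if units_per_100_ul < 0 or reaction_volume_ul < 0:
        raise ValueError("amounts must be non-negative")
    return units_per_100_ul * reaction_volume_ul / 100.0


if __name__ == "__main__":
    # Three lanes per enzyme level, assumed to be one dilution per lane.
    print(serial_dilution(1.0, 10.0, 3))   # concentrations in ng/ul
    # Hypothetical 50 ul reaction at the two enzyme levels used in Example 1.
    print(units_for_volume(2.5, 50.0), units_for_volume(4.0, 50.0))
```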

Example 2: Amplification of Mouse Genomic DNA Using Peptide Tag-Polymerase Mixtures

FIG. 26 shows results from amplification of mouse genomic DNA using a variety of polymerase mixes. Lane 1 shows a 1 kb DNA ladder. Lanes 2-10 show amplification of mouse genomic DNA, 1 ng/μl in 1× PCR. All mixes produced the correct PCR product length of 3838 bp. Lanes 11-19 correspond to the same mixes as in Lanes 2-10 but using plasmid template DNA. The expected PCR product was ~8.6 kb, and only one mix, Peptide Tag-Polymerase Mixture 1.5 mM MgCl₂ (Lane 17), produced the correct product. The Peptide Tag-Polymerase Mixture contains aldehyde-modified forms of SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:11. Lanes 20 and 21 show PCR of Peptide Tag-Polymerase only, which produced no product. A description of the lanes is given below:

Lane 1—1 kb DNA ladder.

Lane 2—Peptide Tag-Polymerase Mixture with bovine serum albumin (BSA) Ready to Load (1.5 mM MgCl₂)

Lane 3—Peptide Tag-Polymerase Mixture with BSA Ready to Load (2.0 mM MgCl₂)

Lane 4—Peptide Tag-Polymerase Mixture with BSA Ready to Load (2.5 mM MgCl₂)

Lane 5—Peptide Tag-Polymerase Mixture with BSA (1.5 mM MgCl₂)

Lane 6—Peptide Tag-Polymerase Mixture with BSA (2.0 mM MgCl₂)

Lane 7—Peptide Tag-Polymerase Mixture with BSA (2.5 mM MgCl₂)

Lane 8—Peptide Tag-Polymerase Mixture 1.5 mM MgCl₂

Lane 9—Peptide Tag-Polymerase Mixture 2.0 mM MgCl₂

Lane 10—Peptide Tag-Polymerase Mixture 2.5 mM MgCl₂

Lane 20—Peptide Tag-Polymerase (1.5 mM MgCl₂) mouse genomic DNA, 1 ng/μl in 1× PCR.

Lane 21—Peptide Tag-Polymerase (1.5 mM MgCl₂) plasmid DNA.
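By way of example only, the lane layout of FIG. 26 as described above can be rebuilt as a lookup table. The sketch assumes that Lanes 11-19 repeat the mixes of Lanes 2-10 in the same order, which is consistent with Lane 17 being the Peptide Tag-Polymerase Mixture at 1.5 mM MgCl₂. The variable and function names are hypothetical.

```python
from __future__ import annotations

MIXES = [
    "Peptide Tag-Polymerase Mixture with BSA Ready to Load",
    "Peptide Tag-Polymerase Mixture with BSA",
    "Peptide Tag-Polymerase Mixture",
]
MGCL2_MM = [1.5, 2.0, 2.5]
TEMPLATES = ["mouse genomic DNA", "plasmid DNA"]


def build_lane_map() -> dict[int, tuple[str, float | None, str | None]]:
    """Map lane number to (contents, MgCl2 in mM, template)."""
    lanes: dict[int, tuple[str, float | None, str | None]] = {
        1: ("1 kb DNA ladder", None, None),
    }
    lane = 2
    # Lanes 2-10 use genomic template; Lanes 11-19 are assumed to repeat
    # the same nine mixes in the same order with plasmid template.
    for template in TEMPLATES:
        for mix in MIXES:
            for mg in MGCL2_MM:
                lanes[lane] = (mix, mg, template)
                lane += 1
    # Controls with Peptide Tag-Polymerase alone.
    lanes[20] = ("Peptide Tag-Polymerase", 1.5, "mouse genomic DNA")
    lanes[21] = ("Peptide Tag-Polymerase", 1.5, "plasmid DNA")
    return lanes


if __name__ == "__main__":
    for number, contents in sorted(build_lane_map().items()):
        print(number, contents)
```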

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

1. A method of increasing thermal stability of a polypeptide comprising linking a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1 with a polypeptide to form a fusion polypeptide, wherein the fusion polypeptide does not have an amino acid sequence as shown in SEQ ID NO: 2.

2. The method of claim 1, wherein the peptide tag has an amino acid sequence as shown in SEQ ID NO: 1, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.

3-11. (canceled)

12. The method of claim 1, wherein the peptide tag is covalently linked to the polypeptide.

13. The method of claim 1, wherein the peptide tag is non-covalently linked to the polypeptide.

14-15. (canceled)

16. The method of claim 1, wherein the polypeptide is erythropoietin, human Leukemia Inhibitory Factor (hLIF), granulocyte macrophage colony-stimulating factor (GM-CSF), insulin, vascular endothelial growth factor (VEGF), leptin, or bevacizumab.

17-20. (canceled)

21. The method of claim 1, wherein the polypeptide comprises a polymerase, reverse transcriptase, nuclease, pyrophosphatase, deaminase, or protease.

22. The method of claim 1, wherein the polypeptide is a polymerase.

23. The method of claim 1, wherein the polypeptide comprises a Bacillus stearothermophilus (Bst) polymerase or fragment thereof.

24-45. (canceled)

46. A fusion polypeptide comprising a) a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1; and b) at least one polypeptide; wherein the peptide tag is linked to the at least one polypeptide, the peptide tag increases thermal stability of the at least one polypeptide, and the fusion polypeptide does not have an amino acid sequence as shown in SEQ ID NO: 2.

47. The fusion polypeptide of claim 46, wherein the peptide tag is covalently linked to the at least one polypeptide.

48. (canceled)

49. The fusion polypeptide of claim 46, wherein the peptide tag is linked to an amino terminus or a carboxy terminus of the at least one polypeptide.

50-75. (canceled)

76. A fusion polypeptide comprising: (a) a peptide tag that has an amino acid sequence that is at least 70% identical to SEQ ID NO: 1; and (b) a polypeptide comprising a polymerase with strand displacement activity.

77. The fusion polypeptide of claim 76, wherein the polymerase with strand displacement activity comprises Bacillus stearothermophilus (Bst) polymerase polypeptide.

78. The fusion polypeptide of claim 76, wherein the polymerase with strand displacement activity is a fragment of Bacillus stearothermophilus (Bst) polymerase.

79. The fusion polypeptide of claim 76, wherein the peptide tag increases thermal stability of the polypeptide comprising the polymerase with the strand displacement activity.

80. The fusion polypeptide of claim 79, wherein the thermal stability of the polypeptide comprising the polymerase with the strand displacement activity is increased for at least 1 day at a temperature between 20° C. and 50° C.

81. The fusion polypeptide of claim 76, wherein the peptide tag inhibits loss of protein function of the polypeptide comprising the polymerase with the strand displacement activity for at least 1 day after exposure to a temperature of at least 35° C.